ABSTRACT

Strong gravitational lens systems with measured time delays between the multiple images provide
a method for measuring the "time-delay distance" to the lens, and thus the Hubble constant. We
present a Bayesian analysis of the strong gravitational lens system B1608+656, incorporating (i)
new, deep Hubble Space Telescope (HST) observations, (ii) a new velocity dispersion measurement of
260 ± 15 km s^-1 for the primary lens galaxy, and (iii) an updated study of the lens' environment. Our
analysis of the HST images takes into account the extended source surface brightness, and the dust
extinction and optical emission by the interacting lens galaxies. When modeling the stellar dynamics
of the primary lens galaxy, the lensing effect, and the environment of the lens, we explicitly include the
total mass distribution profile logarithmic slope γ' and the external convergence κ_ext; we marginalize
over these parameters, assigning well-motivated priors for them, and so turn the major systematic
errors into statistical ones. The HST images provide one such prior, constraining the lens mass
density profile logarithmic slope to be γ' = 2.08 ± 0.03; a combination of numerical simulations and
photometric observations of the B1608+656 field provides an estimate of the prior for κ_ext: 0.10^{+0.08}_{-0.05}.
This latter distribution dominates the final uncertainty on H_0. Fixing the cosmological parameters at
Ω_m = 0.3, Ω_Λ = 0.7, and w = -1 in order to compare with previous work on this system, we find H_0 =
70.6^{+3.1}_{-3.1} km s^-1 Mpc^-1. The new data provide an increase in precision of more than a factor of two,
even including the marginalization over κ_ext. Relaxing the prior probability density function for the
cosmological parameters to that derived from the WMAP 5-year data set, we find that the B1608+656
data set breaks the degeneracy between Ω_m and Ω_Λ at w = -1 and constrains the curvature parameter
to be -0.031 < Ω_k < 0.009 (95% CL), a level of precision comparable to that afforded by the current
Type Ia SNe sample. Asserting a flat spatial geometry, we find that, in combination with WMAP,
H_0 = 69.7^{+4.9}_{-5.0} km s^-1 Mpc^-1 and w = -0.94^{+0.17}_{-0.19} (68% CL), suggesting that the observations of
B1608+656 constrain w as tightly as do the current Baryon Acoustic Oscillation data.
Subject headings: cosmology: observations - distance scale - galaxies: individual (B1608+656) -
gravitational lensing: strong - methods: data analysis



1. INTRODUCTION 

The Hubble constant (H_0, measured in units of
km s^-1 Mpc^-1) is one of the key cosmological parameters
since it sets the present age, size, and critical density
of the Universe.

Based in part on observations made with the NASA/ESA Hubble Space
Telescope, obtained at the Space Telescope Science Institute, which is
operated by the Association of Universities for Research in Astronomy,
Inc., under NASA contract NAS 5-26555. These observations are
associated with program GO-10158.

Argelander-Institut für Astronomie, Auf dem Hügel 71, 53121 Bonn, Germany

Kavli Institute for Particle Astrophysics and Cosmology, Stanford
University, PO Box 20450, MS 29, Stanford, CA 94309, USA

Department of Physics, University of California, Santa Barbara, CA
93106-9530, USA

Department of Physics, University of California at Davis, 1 Shields
Avenue, Davis, CA 95616, USA

Max-Planck-Institut für Astrophysik, Karl-Schwarzschild-Str. 1, 85741
Garching, Germany

Kapteyn Astronomical Institute, P.O. Box 800, 9700AV Groningen, The
Netherlands

Sloan Fellow, Packard Fellow

Electronic address: suyu@astro.uni-bonn.de

Methods for measuring the Hubble constant include
Type Ia supernovae (SNe Ia) (e.g. Tammann 1979;
Riess et al. 2009), the Sunyaev-Zel'dovich effect (e.g.
Sunyaev & Zel'dovich 1980; Bonamente et al. 2006), the
expanding photosphere method for Type II supernovae
(e.g. Kirshner & Kwan 1974; Schmidt et al. 1994), and
maser distances (e.g. Herrnstein et al. 1999; Macri et al.
2006). However, perhaps the two most well-known recent
measurements come from the Hubble Space Telescope
(HST) Key Project (KP) (Freedman et al. 2001) and the
Wilkinson Microwave Anisotropy Probe (WMAP) observations
of the cosmic microwave background (CMB) (e.g.
Komatsu et al. 2009). The HST KP measurement of
H_0 is based on secondary distance indicators (including
Type Ia supernovae, Tully-Fisher, surface brightness
fluctuations, Type II supernovae, and the fundamental
plane) that are calibrated using Cepheid distances
to nearby galaxies with a zero point in the Large
Magellanic Cloud. The resulting Hubble constant is
72 ± 8 km s^-1 Mpc^-1 (Freedman et al. 2001). We note
that the largest contributor to the systematic error from
the distance ladder on which this measurement depends
is the metallicity dependence of the Cepheid
period-luminosity relation. More recently, Riess et al. (2009)
addressed some of these systematic effects with an improved
differential distance ladder using Cepheids, SNe
Ia, and the maser galaxy NGC 4258, finding H_0 =
74.2 ± 3.6 km s^-1 Mpc^-1, a 5% local measurement of
Hubble's constant.
The five year measurement made using
WMAP temperature and polarization data is
H_0 = 71.9^{+2.6}_{-2.7} km s^-1 Mpc^-1 (Dunkley et al. 2009),
under the assumption that the Universe is flat and
that the dark energy is described by a cosmological
constant (with equation of state parameter w = -1).
The uncertainty in H_0 increases markedly if either of
these two assumptions is relaxed, due to degeneracies
with other cosmological parameters. For example,
WMAP gives H_0 ~ 50 km s^-1 Mpc^-1 without the
flatness assumption, and H_0 = 74^{+15}_{-14} km s^-1 Mpc^-1 for
a flat Universe with time-independent w not fixed at
w = -1. As H_0 is such an important parameter, it is
essential to measure it using multiple methods. In this
paper, we use a single strong gravitational lens as an
independent probe of H_0, and explore its systematic
errors and relations with other cosmological parameters
to provide guidance for future studies. We will show
that the single lens is competitive with the
best current cosmographic probes. Given the current
progress in measuring time delays (e.g., Vuissoz et al.
2007, 2008; Paraficz et al. 2009), the methodology in
this paper should lead to substantial advances when
applied to samples of gravitational lenses.

Strong gravitational lensing occurs when a source
galaxy is lensed into multiple images by a galaxy
lying along its line of sight. The principle
of using strong gravitational lens systems with
time-variable sources to measure the Hubble constant
is well understood (e.g. Refsdal 1964;
Schneider, Kochanek, & Wambsganss 2006). The relative
time delays between the multiple images are inversely
proportional to H_0 via a combination of angular
diameter distances and depend on the lens potential
(mass) distribution. We refer to the combination of angular
diameter distances as the "time-delay distance". By
measuring the time delays and modeling the lens potential,
one can infer the value for the time-delay distance;
this distance-like quantity is primarily sensitive to H_0
but depends also on other cosmological parameters which
must be factored into the analysis. The direct measurement
of the time-delay distance means that gravitational
lensing is independent of distance ladders.

Despite being an elegant method, gravitational lensing
has its limitations. Perhaps the most well-known is the
"mass-sheet degeneracy" between H_0 and external convergence
(Falco, Gorenstein, & Shapiro 1985). There is
also a degeneracy between H_0 and the slope of the lens
mass distribution, especially for lenses where the configuration
is nearly symmetric (e.g. Wucknitz 2002). In such
cases, the image positions are at approximately the same
radial distance from the lens center and so the slope is
poorly constrained. In both cases the remedy is to provide
more information. Modeling the mass environment
of the lens can, in principle, independently constrain
the external convergence (e.g., Keeton & Zabludoff 2004;
Fassnacht et al. 2006a; Blandford et al. in preparation);
likewise, lens galaxy stellar velocity dispersion measurements
(e.g., Grogin & Narayan 1996a,b; Tonry & Franx
1999; Koopmans & Treu 2002; Treu & Koopmans 2002;
Barnabè & Koopmans 2007; McKean et al. 2009) and
analysis of any extended images (e.g., Dye & Warren
2005; Dye et al. 2008) can constrain the mass distribution
slope.

A measurement of H_0 to better than a few percent
precision would provide the single most useful complement
to results obtained from studies of the CMB for
dark energy studies (e.g. Hu 2005; Riess et al. 2009).
Dark energy has been used to explain the accelerating
Universe, discovered using luminosity distances to SNe
Ia (Riess et al. 1998; Perlmutter et al. 1999). Efforts in
studying dark energy often characterize it by a constant
equation of state parameter w (where w = -1 corresponds
to a cosmological constant) and assume a flat
Universe. These include Perlmutter et al. (1999), who in
their Figure 10 constrained w < -0.65 for present day
matter density values of Ω_m > 0.2, and Eisenstein et al.
(2005), who combined their angular diameter distance
measurement to z = 0.35 from Baryon Acoustic Oscillations
(BAO) with WMAP data (Spergel et al. 2007)
to obtain w = -0.80 ± 0.18. Recently, Komatsu et al.
(2009) measured w = -0.992^{+0.061}_{-0.062} by combining
WMAP 5-year results (WMAP5) with observations of
SNe Ia (Kowalski et al. 2008) and BAO (Percival et al.
2007). Komatsu et al. (2009) also explored more general
dark energy descriptions. In our study, we combine the
time-delay distance measurement from B1608+656 with
WMAP data to derive a constraint on w, and compare
the constraining power of B1608+656 to that of other
cosmographic probes.

In this paper, we present an accurate measurement of
H_0 from the gravitational lens B1608+656. A comprehensive
lensing analysis of the lens system is in a companion
paper (Paper I: Suyu et al. 2009). Using the results
from Paper I, we focus in this paper on techniques
required to break the mass-sheet degeneracy in order to
infer a value of H_0 with well-understood uncertainty. We
then explore the influence of this measurement on other
cosmological parameters.

The organization of the paper is as follows. In Section
2 we briefly review the theory behind using gravitational
lenses to measure H_0, include a description of the
mass-sheet degeneracy, and describe the dynamics modeling
for the measured velocity dispersion. In Section 3 we
outline the probability theory for combining various data
sets and for including cosmological priors. In Section 4
we present the gravitational lens B1608+656 as a candidate
for measuring H_0, and show the lens modeling
results. We present the new velocity dispersion measurement
and the stellar dynamics modeling in Section 5.
The study of the convergence accumulated along the line
of sight to B1608+656 is discussed in Section 6. The priors
for our model parameters are described in Section 7.
Finally, in Section 8 we combine the lensing, dynamics
and external convergence analyses to break the
mass-sheet degeneracy and infer H_0 from the B1608+656 data
set. We then show how B1608+656 aids in constraining
flatness and measuring w when combined with WMAP,
before concluding in Section 9.

Throughout this paper, we assume a w-CDM universe
where dark energy is described by a time-independent
equation of state with parameter w = P/(ρc^2) with present
day dark energy density Ω_Λ, and the present day matter
density is Ω_m. Each quoted parameter estimate is the
median of the appropriate one-dimensional marginalized
posterior probability density function (PDF), with the
quoted uncertainties showing, unless otherwise stated,
the 16th and 84th percentiles (that is, the bounds of a
68% confidence interval).
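The quoting convention above can be illustrated with a short Python sketch that reduces a set of one-dimensional posterior samples to a median with 16th and 84th percentile bounds. The function name `summarize` and the toy samples are assumptions made for this illustration only; they are not taken from the analysis in this paper.

```python
import statistics


def summarize(samples):
    """Return (median, minus_error, plus_error) from the 16th/50th/84th percentiles."""
    # With n=100, statistics.quantiles returns the 99 cut points for 1%..99%,
    # so index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    p16, p50, p84 = cuts[15], cuts[49], cuts[83]
    return p50, p50 - p16, p84 - p50


if __name__ == "__main__":
    # Toy "posterior": the integers 0..100 (hypothetical values, not real samples).
    median, minus, plus = summarize(list(range(101)))
    print(f"{median:.1f} -{minus:.1f} +{plus:.1f}")
```

An estimate summarized this way would be quoted as median^{+plus}_{-minus}, which is the form used for the parameter values in this paper.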

2. MEASURING H_0 USING LENSING, STELLAR
DYNAMICS, AND LENS ENVIRONMENT STUDIES

In this section we briefly review the theory of gravitational
lensing for H_0 measurement (Section 2.1), describe
the mass-sheet degeneracy (Section 2.2), and present the
dynamics modeling (Section 2.3). Readers familiar with
these subjects can proceed directly to Section 3.

2.1. Theory of gravitational lensing 

For a strong lens system in an otherwise homogeneous
Robertson-Walker universe, the excess time delay of an
image at angular position θ = (θ_1, θ_2) with corresponding
source position β = (β_1, β_2) relative to the case of no
lensing is

    t(\theta, \beta) = \frac{1}{c} \frac{D_d D_s}{D_{ds}} (1 + z_d) \, \phi(\theta, \beta),    (1)

where z_d is the redshift of the lens, φ(θ, β) is the so-called
Fermat potential, and D_d, D_s, and D_ds are, respectively,
the angular diameter distance from us to the lens, from
us to the source, and from the lens to the source. The
Fermat potential is defined as

    \phi(\theta, \beta) \equiv \frac{(\theta - \beta)^2}{2} - \psi(\theta),    (2)

where the first term comes from the geometric path difference
as a result of the strong lens deflection, and the
second term is the gravitational delay described by the
lens potential ψ(θ). The scaled deflection angle of a light
ray is α(θ) = ∇ψ(θ), and the lens equation that governs
the deflection of light rays is β = θ - α(θ).
The projected dimensionless surface mass density κ(θ)
is

    \kappa(\theta) = \frac{1}{2} \nabla^2 \psi(\theta),    (3)

where

    \kappa(\theta) = \frac{\Sigma(D_d \theta)}{\Sigma_{cr}} \quad \text{with} \quad \Sigma_{cr} = \frac{c^2 D_s}{4 \pi G D_d D_{ds}},    (4)

and Σ(D_d θ) is the physical projected surface mass density.

The constant coefficient in Equation (1) is proportional
to the angular diameter distance and hence inversely proportional
to the Hubble constant. We can thus simplify
Equation (1) to the following:

    t(\theta, \beta) = \frac{D_{\Delta t}}{c} \, \phi(\theta, \beta)    (5)
                     \propto \frac{1}{H_0} \, \phi(\theta, \beta),    (6)

where D_Δt ≡ (1 + z_d) D_d D_s / D_ds is referred to as the
time-delay distance.
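As a minimal sketch of how D_Δt in Equations (5) and (6) depends on cosmology, the following Python code evaluates the three angular diameter distances for a w-CDM universe with curvature Ω_k = 1 - Ω_m - Ω_Λ and combines them into the time-delay distance. The function names, the Simpson-rule integrator, and the redshifts in the usage example are assumptions chosen for illustration; they are not the code or the inputs used for the results in this paper.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def inv_e(z, omega_m, omega_l, w):
    """1/E(z) for w-CDM, with curvature Omega_k = 1 - Omega_m - Omega_L."""
    omega_k = 1.0 - omega_m - omega_l
    e2 = (omega_m * (1 + z) ** 3
          + omega_k * (1 + z) ** 2
          + omega_l * (1 + z) ** (3 * (1 + w)))
    return 1.0 / math.sqrt(e2)


def angular_diameter_distance(z1, z2, h0, omega_m, omega_l, w=-1.0, n=2000):
    """Angular diameter distance in Mpc from redshift z1 to z2 (z1 < z2)."""
    # Composite Simpson rule (n must be even) for the integral of dz / E(z).
    h = (z2 - z1) / n
    total = inv_e(z1, omega_m, omega_l, w) + inv_e(z2, omega_m, omega_l, w)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * inv_e(z1 + i * h, omega_m, omega_l, w)
    chi = total * h / 3.0  # comoving distance in units of c/H_0
    omega_k = 1.0 - omega_m - omega_l
    if omega_k > 1e-10:      # open universe
        s = math.sqrt(omega_k)
        chi = math.sinh(s * chi) / s
    elif omega_k < -1e-10:   # closed universe
        s = math.sqrt(-omega_k)
        chi = math.sin(s * chi) / s
    return (C_KM_S / h0) * chi / (1 + z2)


def time_delay_distance(z_d, z_s, h0, omega_m, omega_l, w=-1.0):
    """D_dt = (1 + z_d) D_d D_s / D_ds in Mpc."""
    d_d = angular_diameter_distance(0.0, z_d, h0, omega_m, omega_l, w)
    d_s = angular_diameter_distance(0.0, z_s, h0, omega_m, omega_l, w)
    d_ds = angular_diameter_distance(z_d, z_s, h0, omega_m, omega_l, w)
    return (1 + z_d) * d_d * d_s / d_ds


if __name__ == "__main__":
    # Illustrative redshifts and cosmology (assumed inputs for this sketch).
    for h0 in (60.0, 70.0, 80.0):
        print(h0, time_delay_distance(0.63, 1.39, h0, 0.3, 0.7))
```

Doubling H_0 at fixed Ω_m, Ω_Λ, and w halves the returned D_Δt, which is the inverse proportionality expressed by Equation (6).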

Therefore, by modeling the lens potential and
the source position (β), we can use time-delay lens systems
to deduce the value of the Hubble constant, and
indeed the other cosmological parameters that appear
in D_Δt. In this way, strong lensing can be seen as a
kinematic probe of the universal expansion, in the same
general category as SNe Ia and BAO. Since the principal
dependence of D_Δt is on H_0, we continue to discuss
lenses as a probe of this one parameter; however, we shall
see that the other cosmological parameters play an important
role in the analysis.

Gravitational lens systems with spatially extended 
source surface brightness distributions are of special in- 
terest since they provide additional constraints on the 
lens potential. However, in this case, simultaneous de- 
terminations of the source surface brightness and the lens 
potential are required. 

2.2. Mass-sheet degeneracy 

We now briefly describe the mass-sheet degeneracy and
its relevance to this research (see e.g. Falco et al. 1985;
Schneider et al. 2006 for details). As its name suggests,
this is a degeneracy in the mass modeling corresponding
to the addition of a mass sheet that contributes a
convergence and zero shear (and a matching scaling of
the original mass distribution) which leaves the predicted
image positions unchanged. A circularly symmetric surface
mass density distribution that is uniform interior
to the line of sight is one example of such a lens. Suppose
we have a lens model κ_model(θ) that fits the observables
of a lens system (i.e., image positions, flux ratios
for point sources, and the image shapes for extended
sources). A new model described by the transformation
κ_trans(θ) = λ + (1 - λ) κ_model(θ), where λ is a constant,
would also fit the lensing observables equally well. The
parameter λ corresponds physically to the convergence
of the sheet. Since we might think of including exactly
such a parameter to account for additional physical mass
lying along the line of sight, or in the lens plane to model
a nearby group or cluster, it is clear that the mass-sheet
degeneracy corresponds to a degeneracy between this external
convergence (κ_ext) and the mass normalization of
the lens galaxy.^

Despite the invariance of the image positions, shapes
and relative fluxes under a mass-sheet transformation,
the relative Fermat potential between the images
changes according to Δφ_trans(θ, β_trans) =
(1 - λ) Δφ_model(θ, β_model). Therefore, given measured relative
time delays Δt, which are inversely proportional to H_0
and proportional to the relative Fermat potential (Equation
[6]), the transformed model κ_trans would lead to an
H_0 that is a factor (1 - λ) lower than that of the initial
κ_model (for fixed Ω_m, Ω_Λ, and w). In other words, if
there is physically any external convergence κ_ext due to
the lens' local environment or mass structure along the
line of sight to the lens system that is not incorporated
in the lens modeling, then

    H_0^{true} = (1 - \kappa_{ext}) \, H_0^{model}.    (7)
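The transformation and Equation (7) can be made concrete with a small Python sketch. It applies a mass sheet to a singular isothermal sphere (an assumed toy profile, not the B1608+656 model), checks that the mean convergence inside the Einstein radius stays at unity, and applies the (1 - κ_ext) correction to a model value of H_0. All function names and numbers here are hypothetical.

```python
def mass_sheet_transform(kappa_model, lam):
    """Return kappa_trans(R) = lam + (1 - lam) * kappa_model(R)."""
    def kappa_trans(r):
        return lam + (1.0 - lam) * kappa_model(r)
    return kappa_trans


def mean_convergence(kappa, radius, n=1000):
    """Mean convergence inside a circle: (2 / R^2) * int_0^R kappa(r) r dr (midpoint rule)."""
    dr = radius / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += kappa(r) * r
    return 2.0 * total * dr / radius ** 2


def true_h0(h0_model, kappa_ext):
    """Equation (7): H_0 corrected for an unmodeled external convergence."""
    return (1.0 - kappa_ext) * h0_model


THETA_E = 1.0  # Einstein radius of the toy lens, arbitrary angular units


def sis(r):
    """Convergence of a singular isothermal sphere with Einstein radius THETA_E."""
    return THETA_E / (2.0 * r)


if __name__ == "__main__":
    transformed = mass_sheet_transform(sis, 0.1)
    print(mean_convergence(sis, THETA_E), mean_convergence(transformed, THETA_E))
    print(true_h0(80.0, 0.1))
```

Both models enclose a mean convergence of one within the same radius, so lensing alone cannot tell them apart, while the inferred H_0 differs by the factor (1 - κ_ext).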



This degeneracy is present because lensing observations
only deliver relative positions and fluxes. The degeneracy
can be broken, allowing us to measure H_0, if (i)
we know the magnitude or angular size of the source in
absence of lensing, (ii) we have information on the mass
normalization of the lens, or (iii) we can compare the
measured shear in the lens with the observed distribution
of mass to calibrate κ_ext. For most of the strong lens
systems including B1608+656, case (i) does not apply, so
circumventing the mass-sheet degeneracy requires the input
of more information, either about the lensing galaxy
itself, or its three-dimensional environment. We distinguish
between two kinds of mass sheets: internal and
external. Internal mass sheets, which are physically associated
with the lens galaxy, are due to nearby, physically
associated galaxies, groups or clusters which, crucially,
affect the stellar dynamics of the lens galaxy. External
mass sheets describe mass distributions that are not
physically associated with the lens galaxy and, by definition,
do not affect the stellar dynamics. Typically these
will lie along the line of sight to the lens (Fassnacht et al.
2006a). We identify κ_ext as the net convergence of this
external mass sheet.

^ To be specific, the prescription that we adopt for combining
the effects of many mass sheets at redshifts z_i with surface mass
densities Σ_i(D_i θ) is κ_ext = [...]

Two methods for breaking the mass-sheet degeneracy 
are then: 

i. Stellar dynamics of the lens galaxy. Stellar dynamics
can be used jointly with lensing to break the
internal mass-sheet degeneracy by providing an estimate
of the enclosed mass at a radius different
from the Einstein radius, which is approximately
the radius of the lensed images from the lens galaxy
(e.g., Grogin & Narayan 1996a,b; Tonry & Franx
1999; Koopmans & Treu 2002; Treu & Koopmans
2002; Barnabè & Koopmans 2007). We note that
for a given stellar velocity dispersion, there is
a degeneracy in the mass and the stellar orbit
anisotropy (which characterizes the amount of tangential
velocity dispersion relative to radial dispersion).
Nonetheless, the mass-isotropy degeneracy
is nearly orthogonal to the mass-sheet degeneracy,
so a combination of the mass within the effective
radius (from the stellar velocity dispersion) and the
mass within the Einstein radius (from lensing) effectively
breaks both the mass-isotropy and the internal
mass-sheet degeneracies. We describe how
this works within the context of our chosen mass
model in Section 2.3 below.



ii. Studying the environment and the line of sight to
the lens galaxy. Observations of the field around
lens galaxies allow a rough picture of the projected
mass distribution to be built up. Many lens galaxies
lie in galaxy groups, which can be identified
either by their spectra or, more cheaply (but less
accurately), by their colors and magnitudes. By
modeling the mass distribution of the groups and
galaxies in the lens plane and along the line of
sight to the lens galaxy, one can estimate the external
convergence κ_ext at the redshift of the lens
(e.g. Momcheva et al. 2006; Fassnacht et al. 2006a;
Auger et al. 2007, and references therein). The
group modeling requires (i) identification of the
galaxies that belong to the group of the lens galaxy,
and (ii) estimates of the group centroid and velocity
dispersion. A number of recipes can be followed.
For example, Keeton & Zabludoff (2004)
considered two extremes: (i) the group is described
by a single smooth mass distribution, and (ii)
the masses are associated with individual galaxy
group members with no common halo. The realistic
mass distribution for a galaxy group should
be somewhere between these two extremes. The
experience to date is that modeling lens environments
accurately is very difficult, with uncertainties
of 100% typical (e.g. Momcheva et al. 2006;
Fassnacht et al. 2006a). In Section 6 we describe
an alternative approach for quantifying the external
convergence in a statistical manner: ray-tracing
through numerical simulations of large-scale structure
(Hilbert et al. 2007). In that section we also
present a first attempt at tailoring the ray-tracing
results to our one line of sight, using the relative
galaxy number counts in the field.

We emphasize that the mass-sheet degeneracy is simply
one of the several parameter degeneracies in the lens
modeling that has been given a special name. When
power-laws (κ ∝ b R^{1-γ'}, where R is the radial distance
from the lens center, b is the normalization of the lens,
and γ' is the radial slope in the mass profile) are used to
describe the lens mass distribution, one often finds a
H_0-γ' degeneracy in addition to the H_0-b-κ_ext (mass-sheet)
degeneracy (for fixed Ω_m, Ω_Λ, and w; more generally, D_Δt
would be in place of H_0). These two degeneracies are of
course related via H_0. The H_0-γ' degeneracy primarily
occurs in lens systems with symmetric configurations due
to a lack of information on γ'. In contrast, lens systems
with images spanning a range of radii or with extended
images provide information on γ' (e.g. Wucknitz et al.
2004; Dye et al. 2008), and so the H_0-γ' degeneracy is
broken. Nonetheless, the H_0-b-κ_ext degeneracy is still
present unless we provide information from dynamics and
lens environment studies.

2.3. Stellar dynamics modeling 

In order to model the velocity dispersion of the stars
in the lens galaxy, we need a model for the local gravitational
potential well in which those stars are orbiting.
This potential is due to both the mass distribution
of the lens galaxy, and also the "internal
mass sheet" due to neighboring groups and galaxies
physically associated with the lens, as described in
the previous subsection. Recent studies such as the
Sloan Lens ACS Survey (SLACS) and hydrostatic X-ray
analyses found that the sum of these internal components
can be well-described by a power law (e.g.
Treu et al. 2006; Koopmans et al. 2006; Gavazzi et al.
2007; Koopmans et al. 2009; Humphrey & Buote 2009).
With this in mind, we assume that the total (lens plus
sheet) mass density distribution is spherically symmetric
and of the form

    \rho_{local} = \rho_0 \left( \frac{r_0}{r} \right)^{\gamma'},    (8)

where γ' is the logarithmic slope of the effective lens density
profile, and ρ_0 r_0^{γ'} is the normalization of the mass
distribution that is determined quite precisely by the
lensing, up to a small offset contributed by the external
convergence κ_ext. This normalization can be expressed
in terms of observable or inferrable quantities as we show
below.

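By integrating ρ_local within a cylinder with radius given
by the Einstein radius R_Ein, we find

    M_{local} = 4\pi \int_0^{\infty} dz \int_0^{R_{Ein}} s \, ds \, \frac{\rho_0 r_0^{\gamma'}}{(s^2 + z^2)^{\gamma'/2}}    (9)

              = -\rho_0 r_0^{\gamma'} \, \frac{\pi^{3/2} \, \Gamma\left(\frac{\gamma' - 3}{2}\right)}{\Gamma\left(\frac{\gamma'}{2}\right)} \, R_{Ein}^{3 - \gamma'}.    (10)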



However, the mass responsible for creating an Einstein
ring is a combination of this local mass and the external
mass contributed along the line of sight, so the mass
contained within the Einstein ring is

    M_{Ein} = M_{local} + M_{ext},    (11)

where M_Ein is the mass enclosed within the Einstein radius
R_Ein that would be inferred from lensing,^ given
by

    M_{Ein} = \pi R_{Ein}^2 \Sigma_{cr},    (12)

and M_ext is the mass contribution from κ_ext,

    M_{ext} = \pi R_{Ein}^2 \kappa_{ext} \Sigma_{cr}.    (13)



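Combining Equations (10), (11), (12), and (13), we find

    \rho_0 r_0^{\gamma'} = (\kappa_{ext} - 1) \, \Sigma_{cr} R_{Ein}^{\gamma' - 1} \, \frac{\Gamma\left(\frac{\gamma'}{2}\right)}{\pi^{1/2} \, \Gamma\left(\frac{\gamma' - 3}{2}\right)}.    (14)

Substituting this in Equation (8), we obtain

    \rho_{local} = (\kappa_{ext} - 1) \, \Sigma_{cr} R_{Ein}^{\gamma' - 1} \, \frac{\Gamma\left(\frac{\gamma'}{2}\right)}{\pi^{1/2} \, \Gamma\left(\frac{\gamma' - 3}{2}\right)} \, \frac{1}{r^{\gamma'}}.    (15)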
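The algebra leading to Equations (14) and (15) can be checked numerically with the sketch below. It evaluates the closed form of Equation (10) and the normalization of Equation (14) using the standard-library gamma function, which accepts the negative non-integer argument (γ' - 3)/2 that arises for 1 < γ' < 3. The function names and the unit choices (Σ_cr = R_Ein = 1) are assumptions for this illustration.

```python
import math


def projected_mass(rho0_r0g, gamma_p, r_ein):
    """Equation (10): local mass inside a cylinder of radius r_ein."""
    coeff = -math.pi ** 1.5 * math.gamma((gamma_p - 3.0) / 2.0) / math.gamma(gamma_p / 2.0)
    return rho0_r0g * coeff * r_ein ** (3.0 - gamma_p)


def normalization(kappa_ext, sigma_cr, r_ein, gamma_p):
    """Equation (14): rho_0 r_0^gamma' in terms of lensing quantities."""
    ratio = math.gamma(gamma_p / 2.0) / (math.sqrt(math.pi) * math.gamma((gamma_p - 3.0) / 2.0))
    return (kappa_ext - 1.0) * sigma_cr * r_ein ** (gamma_p - 1.0) * ratio


def rho_local(r, kappa_ext, sigma_cr, r_ein, gamma_p):
    """Equation (15): local three-dimensional density at radius r."""
    return normalization(kappa_ext, sigma_cr, r_ein, gamma_p) / r ** gamma_p


if __name__ == "__main__":
    # Illustrative inputs in units where Sigma_cr = R_Ein = 1 (assumed values).
    norm = normalization(0.10, 1.0, 1.0, 2.08)
    m_local = projected_mass(norm, 2.08, 1.0)
    # Equations (11)-(13) require M_local = (1 - kappa_ext) * pi * R_Ein^2 * Sigma_cr.
    print(norm, m_local, 0.9 * math.pi)
```

Because Γ((γ' - 3)/2) is negative for 1 < γ' < 3, the factor (κ_ext - 1) yields a positive density normalization whenever κ_ext < 1.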



Spherical Jeans modeling can then be employed
to infer the line-of-sight velocity dispersion,
σ^P(γ', κ_ext, β_ani, Ω_m, Ω_Λ, w), from ρ_local by assuming
a model for the stellar distribution ρ_* (e.g.,
Binney & Tremaine 1987). Here, β_ani is a general
anisotropy term that can be expressed in terms of an
anisotropy radius parameter for the stellar velocity
ellipsoid, r_ani, in the Osipkov-Merritt formulation
(Osipkov 1979; Merritt 1985):

    \beta_{ani} = \frac{r^2}{r_{ani}^2 + r^2},    (16)

where r_ani = 0 is pure radial orbits and r_ani → ∞ is
isotropic with equal radial and tangential velocity dispersions.
The dependence of σ^P on Ω_m, Ω_Λ, and w enters
through Σ_cr and the physical scale radius of the stellar
distribution, but the dependence on H_0 drops out.

We now follow Binney & Tremaine (1987) to show how
the model velocity dispersion is calculated. The
three-dimensional radial velocity dispersion σ_r is found by
solving the spherical Jeans equation

    \frac{1}{\rho_*} \frac{d(\rho_* \sigma_r^2)}{dr} + 2 \frac{\beta_{ani} \sigma_r^2}{r} = -\frac{G M(r)}{r^2},    (17)



where M(r) is the mass enclosed within a radius r for
the total density profile given by Equation (8) and with
ρ_* given by the Hernquist profile (Hernquist 1990)

    \rho_*(r) = \frac{I_0 \, a}{2\pi r (r + a)^3},    (18)



^ By definition, R_Ein is the radius within which the total mean
convergence is unity.



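where the scale radius a is related to the effective radius
r_eff by a = 0.551 r_eff and I_0 is a normalization term. The
solution to Equation (17) is

    \sigma_r^2 = \frac{4\pi G \, a^{-\gamma'} \rho_0 r_0^{\gamma'}}{3 - \gamma'} \, \frac{r (r + a)^3}{r^2 + r_{ani}^2}
                 \left( \frac{r_{ani}^2}{a^2} \, \frac{{}_2F_1\left[2 + \gamma', \gamma'; 3 + \gamma'; \frac{1}{1 + r/a}\right]}{(2 + \gamma') (r/a + 1)^{2 + \gamma'}}
                 + \frac{{}_2F_1\left[3, \gamma'; 1 + \gamma'; -a/r\right]}{\gamma' (r/a)^{\gamma'}} \right),    (19)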



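where 2F1 is a hypergeometric function. The model
luminosity-weighted velocity dispersion within an aperture
A is then

    (\sigma^P)^2 = \frac{\int_A \left[ I_H(R) \, \sigma_s^2 * \mathcal{P} \right] R \, dR \, d\theta}{\int_A \left[ I_H(R) * \mathcal{P} \right] R \, dR \, d\theta},    (20)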



where I_H(R) is the projected Hernquist distribution
(Hernquist 1990), both integrands are convolved with the
seeing P as indicated, and the theoretical (that is, before
convolution and integration over the spectrograph aperture)
luminosity-weighted projected velocity dispersion
σ_s is given by

    I_H(R) \, \sigma_s^2 = 2 \int_R^{\infty} \left( 1 - \beta_{ani} \frac{R^2}{r^2} \right) \frac{\rho_* \sigma_r^2 \, r \, dr}{\sqrt{r^2 - R^2}}.    (21)



The use of a Jaffe (1983) stellar distribution function
follows the same derivation.
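As a cross-check on the Jeans modeling of Equations (16)-(18), the radial dispersion can also be obtained by direct numerical integration instead of through hypergeometric functions. The sketch below is an assumed illustration, not the implementation used in this paper: it multiplies Equation (17) by the integrating factor (r^2 + r_ani^2), integrates from r to infinity with the substitution x = r/t, and uses units with G = 1. The function names and parameter values are hypothetical.

```python
import math


def beta_ani(r, r_ani):
    """Equation (16): Osipkov-Merritt anisotropy at radius r."""
    return r * r / (r_ani * r_ani + r * r)


def hernquist_density(r, a, i0=1.0):
    """Equation (18): Hernquist profile with scale radius a and normalization i0."""
    return i0 * a / (2.0 * math.pi * r * (r + a) ** 3)


def sigma_r2(r, gamma_p, a, r_ani, rho0_r0g, g_newton=1.0, n=4000):
    """Radial velocity dispersion squared solving Equation (17).

    Uses M(x) = 4 pi rho0_r0g x^(3 - gamma') / (3 - gamma') from Equation (8), so that
    sigma_r^2 = prefac * r (r + a)^3 * int_r^inf w(x) x^(-gamma') / (x + a)^3 dx,
    with w(x) = (x^2 + r_ani^2) / (r^2 + r_ani^2).
    """
    prefac = 4.0 * math.pi * g_newton * rho0_r0g / (3.0 - gamma_p)
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n          # midpoint rule on t in (0, 1), with x = r / t
        x = r / t
        weight = (x * x + r_ani * r_ani) / (r * r + r_ani * r_ani)
        total += weight * x ** (-gamma_p) / (x + a) ** 3 * r / (t * t)
    integral = total / n
    return prefac * r * (r + a) ** 3 * integral


if __name__ == "__main__":
    # Isothermal total profile (gamma' = 2) with purely radial orbits (r_ani = 0):
    # the integral is analytic and gives sigma_r^2 = 4 pi G rho0_r0g (r + a) / (2 r).
    r, a = 3.0, 1.0
    print(sigma_r2(r, 2.0, a, 0.0, 1.0), 4.0 * math.pi * (r + a) / (2.0 * r))
```

The radial-orbit isothermal case is used as the comparison because its integral has a simple closed form; other values of γ' and r_ani go through the same routine, and the result can then be inserted into Equation (21) for the projected dispersion.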

In the next section, we present the probability theory
for obtaining the posterior probability distribution of H_0 by
combining the lensing, dynamics and lens environment
studies.

3. PROBABILITY THEORY 

We aim to obtain an expression for the posterior probability
distribution of cosmological parameters H_0, Ω_m,
Ω_Λ, and w given the various independent data sets of
B1608+656.

3.1. Notations for joint modeling of data sets 

We introduce notations for the observed data and the 
model parameters that will be used throughout the rest 
of this paper. 

We have three independent data sets for B1608+656:
the time delay measurements from the radio observations
of the four lensed images A, B, C and D
(Fassnacht et al. 1999, 2002), HST Advanced Camera
for Surveys (ACS) observations associated with program
10158 (PI: Fassnacht; Suyu et al. 2009), and the stellar
velocity dispersion measurement of the primary lens
galaxy G1 (see Section 5). Let Δt be the time delay measurements
of images A, C and D relative to image B, d
be the data vector of the surface brightness
measurements of the gravitationally lensed image, and σ
be the stellar velocity dispersion measurement of the lens
galaxy.

As shown in Section 2.1, information on H_0, Ω_m, Ω_Λ,
and w comes primarily from the relative time delays between
the images, which is a product of the time-delay
distance D_Δt and the Fermat potential difference. The
Fermat potential is determined by the lens potential and
the source position that is given by the lens equation.
Therefore, the first step is to model the lens system using
the observed lensed image d. In order to model the lens
mass distribution using the extended source information,
we need to model the point-spread function (PSF) B, image
covariance matrix C_D, lens galaxy light l, and dust
K (if present) (e.g. Suyu et al. 2009). We collectively
denote these discrete models associated with the lensed
image processing as M_D = {B, C_D, l, K}. We explored
a representative subspace of models M_D in Paper I, using
the Bayesian evidence from the ACS data analysis to
quantify the appropriateness of each model tested. Given
a particular image processing model, we can infer the
parameters of the lens potential and the source surface
brightness distribution from the ACS data d. The data
models are denoted by M_D = M_2, ..., M_11 for Models
2-11 in Paper I.

The lens potential can be simply parametrized by, for
example, a singular power-law ellipsoid (SPLE) with surface
mass density

    \kappa(\theta_1, \theta_2) = b \left[ \theta_1^2 + \frac{\theta_2^2}{q^2} \right]^{(1 - \gamma')/2},    (22)

where q is the axis ratio, b is the lens strength that determines
the Einstein radius (R_Ein), and γ' is the radial
slope (e.g. Barkana 1998; Koopmans et al. 2003). The
distribution is then translated (with two parameters for
the centroid position) and rotated by the position angle
parameter. There is no need to include an external
convergence parameter in the mass distribution during
the lens modeling since we cannot determine it due to
the mass-sheet degeneracy. Instead, we explicitly incorporate
the external convergence in the Fermat potential
later on, taking into account the interplay among this
parameter, the slope, and the normalization of the effective
lens mass distribution. We collectively label all
the parameters of the simply-parametrized model by η,
except for the radial slope γ'.

Alternatively, the lens potential can be described on a 
grid of pixels, especially when the source galaxy is spa- 
tially extended (which provides additional constraints on 
the lens potential). We focus on this case; in particular, 
we decompose the lens potential into an initial simply- 
parametrized SPLE model ?/;o(7',rj) and grid-based po- 
tential corrections denoted by the vector Stj}. The fi- 
nal potential, which is on the same grid of pixels as the 
corrections, is xp = ■0o(7','^) + f^^, where t/>o(7',T/) is 
the vector of initial potential values evaluated at the 
grid points. Furthermore, we also describe the extended 
source surface brightness distribution on a (different) 
grid of pixels by the vector s. The determination of 
the source surface brightness distribution given the lens 
potential model is a regularized linear inversion. The 
strength and form of the regularization are denoted by 
λ and g, respectively. The procedure for obtaining the
pixelated potential corrections and the corresponding ex- 
tended source surface brightness distribution is iterative 
and is described in detail in Paper I. We highlight that 
the resulting (iterated) pixelated lens potential model is 
not limited by the parametrization of the initial SPLE 
model - tests of this method in Paper I showed that when 
the iterative procedure converged, the true potential was 
reconstructed irrespective of the initial model. 



The resulting lens potential allows us to compute the Fermat potential at each image position, up to a factor of (1 - κ_ext). Combining the Fermat potential with a value of D_Δt computed given the cosmological parameters {H_0, Ω_m, Ω_Λ, w} provides us with predicted values of the image time delays, Δt^P.

The dynamics modeling of the galaxy is performed following Section 2.3. By construction, the power-law profile for the dynamics modeling with slope γ' matches the radial profile of the SPLE. Although spherical symmetry is assumed for the dynamics modeling, a suitably defined Einstein radius from the lens modeling leads to R_Ein and M_Ein that are independent of q and are directly applicable to the spherical dynamics modeling (e.g., Koopmans et al. 2006). Furthermore, the results from SLACS based on spherical dynamics modeling (Koopmans et al. 2009) agree with those from more sophisticated two-dimensional kinematics analyses of six SLACS lenses (Czoske et al. 2008; Barnabè et al. 2009), indicating that spherical dynamics modeling for B1608+656 is sufficient. The predicted velocity dispersion is dependent on six parameters: 1) the effective lens mass distribution profile slope γ', 2) the external convergence κ_ext, 3) the anisotropy radius r_ani, and then the cosmological parameters 4) Ω_m, 5) Ω_Λ, and 6) w.

By combining lensing, dynamics, and lens environment studies, we can break the D_Δt-κ_ext degeneracy to obtain a probability distribution for the cosmological parameters {H_0, Ω_m, Ω_Λ, w} given the data sets. In the inference, we assume that the redshifts of the lens and source galaxies are known exactly for the computation of D_Δt. This is approximately true for B1608+656, which has spectroscopic measurements for the redshifts (Myers et al. 1995; Fassnacht et al. 1996): an uncertainty of 0.0003 on the redshifts translates to < 0.2% in time-delay distance, and hence in H_0 for fixed Ω_m, Ω_Λ and w. By imposing sensible priors on {H_0, Ω_m, Ω_Λ, w} from other independent experiments such as WMAP5, we can marginalize the distribution to obtain the posterior probability distribution for H_0.

3.2. Constraining cosmological parameters 

In this section, we describe the probability theory for inferring cosmological parameters from the B1608+656 data sets. Readable introductions to this type of analysis can be found in the books by Sivia (1996) and MacKay (2003); we use notation consistent with that in Paper I.

Our goal is to obtain the posterior PDF for the model parameters ξ given the three independent data sets {Δt, d, σ}:

$$ P(\xi \,|\, \Delta t, d, \sigma) \propto P(\Delta t \,|\, \xi) \, P(d \,|\, \xi) \, P(\sigma \,|\, \xi) \, P(\xi), \tag{23} $$

where the parameters ξ consist of all the model parameters for obtaining the predicted data sets described in Section 3.1: γ', κ_ext, η, δψ, s, M_D, r_ani, H_0, Ω_m, Ω_Λ, w. For notational simplicity, we denote the cosmological parameters as π = {H_0, Ω_m, Ω_Λ, w}. In Equation (23), the dependence on z_s and z_d is implicit.

To obtain the PDF of cosmological parameters π, we marginalize Equation (23) over all parameters apart from π:

$$ P(\pi \,|\, \Delta t, d, \sigma) \propto \int d\gamma' \, d\kappa_{\rm ext} \, d\eta \, d\delta\psi \, ds \, dM_{\rm D} \, dr_{\rm ani} \; \underbrace{P(\Delta t \,|\, \xi) \, P(d \,|\, \xi) \, P(\sigma \,|\, \xi)}_{\rm likelihood} \; P(\pi, \gamma', \kappa_{\rm ext}, \eta, \delta\psi, s, M_{\rm D}, r_{\rm ani}). \tag{24} $$

In the following subsection, we discuss each of the three 
terms in the joint likelihood function in turn. 

3.3. Likelihoods 

Each of the three likelihoods in Equation (24) generally depends only on a subset of the parameters ξ. Specifically, dropping independences, we have P(Δt|ξ) = P(Δt|π, γ', κ_ext, η, δψ, M_D), P(d|ξ) = P(d|γ', η, δψ, s, M_D), and P(σ|ξ) = P(σ|Ω_m, Ω_Λ, w, γ', κ_ext, r_ani).

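For B1608+656, we can simplify and drop independences further in the time delay likelihood P(Δt|ξ) by expressing the relative Fermat potential (relative to image B for the images A, C and D) as

$$ \Delta\phi(\gamma', \kappa_{\rm ext}, M_{\rm D}) = (1 - \kappa_{\rm ext}) \, q(\gamma', M_{\rm D}), \tag{25} $$

and writing the i-th (AB, CB or DB) predicted time delay as

$$ \Delta t_i^{\rm P} = \frac{1}{c} \, D_{\Delta t}(z_{\rm d}, z_{\rm s}, \pi) \cdot \Delta\phi_i(\gamma', \kappa_{\rm ext}, M_{\rm D}) \tag{26} $$

(see Appendix A for details). Here q(γ', M_D) denotes the relative Fermat potential at zero external convergence, not the axis ratio q of Equation (22). The resulting likelihood is

$$ P(\Delta t \,|\, z_{\rm d}, z_{\rm s}, \pi, \gamma', \kappa_{\rm ext}, M_{\rm D}) = \prod_{i=1}^{3} P(\Delta t_i \,|\, z_{\rm d}, z_{\rm s}, \pi, \gamma', \kappa_{\rm ext}, M_{\rm D}), \tag{27} $$

where we assume that the three time delay measurements are independent, and that each P(Δt_i|z_d, z_s, π, γ', κ_ext, M_D) is given by the PDF in Fassnacht et al. (2002).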
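The product in Equation (27) is most conveniently evaluated as a sum of log likelihoods. The sketch below illustrates this with a two-piece (split) normal for each delay, built from the central values and asymmetric errors reported for this system by Fassnacht et al. (2002). The split-normal shape is an assumption made here for illustration only; the analysis itself uses the PDFs from that paper, which need not have this form. All names are hypothetical.

```python
import math

# Measured delays relative to image B, in days: (value, upper error, lower error).
# Numbers as reported by Fassnacht et al. (2002); the PDF shape below is an assumption.
MEASURED_DELAYS = {
    "AB": (31.5, 2.0, 1.0),
    "CB": (36.0, 1.5, 1.5),
    "DB": (77.0, 2.0, 1.0),
}

def log_split_normal(x, mode, sigma_hi, sigma_lo):
    """Log density of a two-piece normal with different widths above and below the mode."""
    sigma = sigma_hi if x >= mode else sigma_lo
    log_norm = 0.5 * math.log(2.0 / math.pi) - math.log(sigma_hi + sigma_lo)
    return log_norm - 0.5 * ((x - mode) / sigma) ** 2

def log_likelihood_time_delays(predicted):
    """Log of the product over the three image pairs, treating the delays as independent."""
    return sum(
        log_split_normal(predicted[pair], *MEASURED_DELAYS[pair])
        for pair in MEASURED_DELAYS
    )
```

A call such as `log_likelihood_time_delays({"AB": 32.0, "CB": 36.5, "DB": 76.0})` returns the joint log likelihood of one set of predicted delays.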

The pixelated lens potential and source surface brightness reconstruction allows us to compute

$$ P(d \,|\, \gamma', \eta, \delta\psi_{\rm MP}, M_{\rm D}) = \int ds \, P(d \,|\, \gamma', \eta, \delta\psi_{\rm MP}, s, M_{\rm D}) \, P(s \,|\, \lambda, g), \tag{28} $$

by marginalizing out the source surface brightness s. The most probable potential correction, δψ_MP, is the result of the pixelated potential reconstruction method. The likelihood for the lens parameters, P(d|γ', η, δψ_MP, M_D), is also the Bayesian evidence of the source surface brightness reconstruction; the analytic expression for this likelihood is given by Equation (19) in Suyu et al. (2006).
Part of the marginalization in Equation (24) can be simplified via

$$ \int d\delta\psi \, ds \, dM_{\rm D} \, P(d \,|\, \gamma', \eta, \delta\psi, s, M_{\rm D}) \, P(s \,|\, \lambda, g) \, P(M_{\rm D}) \, P(\delta\psi) \;\underset{\sim}{\propto}\; P(d \,|\, \gamma', \eta, M_{\rm D} = M_5), \tag{29} $$

under various assumptions stated in Appendix A that are either justified in Paper I or will be shown to be valid in Section 4.2. In essence, we find that the ACS data models that give acceptable fits are all equally probable within their errors, making conditioning on M_D (i.e., setting M_D = M_5, where M_5 is Model 5 in Paper I for the lensed image processing) approximately equivalent to marginalizing over all models M_D.

Furthermore, we can marginalize out the parameters of the smooth lens model η separately:

$$ P(\gamma' \,|\, d, M_{\rm D} = M_5) \propto \int d\eta \, P(d \,|\, \gamma', \eta, M_{\rm D} = M_5) \, P_{\rm no\,ACS}(\gamma') \, P(\eta). \tag{30} $$

(See Appendix A for details of the assumptions involved.) We see that the resulting PDF, P(γ'|d, M_D = M_5), can itself be treated as a prior on the slope γ'. Without the ACS data d, this distribution will default to the lower level prior P_noACS(γ'). For the rest of this section we refer only to the generic prior P(γ'), keeping in mind that this distribution may or may not include the information from the ACS data. This will allow us to isolate the influence of the ACS data on the final results, when we compare the PDF in Equation (30) with some alternative choices of P(γ').

For the velocity dispersion likelihood, the predicted velocity dispersion σ^P as a function of the parameters described in Section 3.1 is

$$ \sigma^{\rm P} = \sigma^{\rm P}(\Omega_{\rm m}, \Omega_\Lambda, w, \gamma', \kappa_{\rm ext}, r_{\rm ani} \,|\, z_{\rm d}, z_{\rm s}, r_{\rm eff}, R_{\rm Ein}, M_{\rm Ein}), \tag{31} $$

where the effective radius, r_eff, the Einstein radius, R_Ein, and the mass enclosed within the Einstein radius, M_Ein, are fixed. The effective radius is fixed by observations, and R_Ein and M_Ein are the quantities that lensing delivers robustly. The uncertainty in the dynamics modeling due to the error associated with r_eff, R_Ein and M_Ein is negligible compared to the uncertainties associated with κ_ext. The likelihood function for σ is a Gaussian:

$$ P(\sigma \,|\, \Omega_{\rm m}, \Omega_\Lambda, w, \gamma', \kappa_{\rm ext}, r_{\rm ani}) = \frac{1}{\sqrt{2\pi\sigma_\sigma^2}} \exp\left[ -\frac{(\sigma - \sigma^{\rm P})^2}{2\sigma_\sigma^2} \right]. \tag{32} $$
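The Gaussian likelihood of Equation (32) is straightforward to evaluate in log form once a predicted dispersion is available. The sketch below uses the measured value of 260 ± 15 km/s quoted in the abstract; the function name is hypothetical, and the dynamical model that produces the predicted dispersion is not reproduced here.

```python
import math

SIGMA_OBS = 260.0      # measured velocity dispersion of G1 in km/s (from the abstract)
SIGMA_OBS_ERR = 15.0   # its 1-sigma uncertainty in km/s

def log_likelihood_sigma(sigma_pred, sigma_obs=SIGMA_OBS, sigma_err=SIGMA_OBS_ERR):
    """Log of the Gaussian likelihood for a model-predicted dispersion (km/s)."""
    log_norm = -0.5 * math.log(2.0 * math.pi * sigma_err**2)
    return log_norm - 0.5 * ((sigma_obs - sigma_pred) / sigma_err) ** 2
```

A prediction that misses the measurement by one standard deviation (245 or 275 km/s) lowers the log likelihood by 0.5 relative to a perfect match.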



Finally then, we have the following simplified version of Equation (24), where the posterior PDF has been successfully compartmentalized into manageable pieces:

$$ P(\pi \,|\, \Delta t, d, \sigma) \propto \int d\gamma' \, d\kappa_{\rm ext} \, dr_{\rm ani} \; P(\Delta t \,|\, z_{\rm d}, z_{\rm s}, \pi, \gamma', \kappa_{\rm ext}, M_{\rm D} = M_5) \cdot P(\sigma \,|\, \Omega_{\rm m}, \Omega_\Lambda, w, \gamma', \kappa_{\rm ext}, r_{\rm ani}) \cdot P(\gamma') \, P(\kappa_{\rm ext}) \, P(r_{\rm ani}) \, P(\pi). \tag{33} $$

Sections 4 to 7 address the specific forms of the likelihoods and the priors in Equation (33). In particular, in the next section, we focus on the lens modeling of B1608+656, which will justify the assumptions mentioned above and provide both the time delay likelihood and the ACS P(γ') prior.

4. LENS MODEL OF B1608+656

Fig. 1. HST ACS image of B1608+656 from 11 orbits in F814W and 9 orbits in F606W. North is up and east is left. The lensed images of the source galaxy are labeled by A, B, C, and D, and the two lens galaxies are G1 and G2. 1 arcsec corresponds to approximately 7 kpc at the redshift of the lens.

The quadruple-image gravitational lens B1608+656 was discovered in the Cosmic Lens All-Sky Survey (CLASS; Myers et al. 1995; Browne et al. 2003; Myers et al. 2003). Figure 1 is an image of B1608+656, showing the spatially extended source surface brightness distribution (with lensed images labeled by A, B, C, and D) and two interacting galaxy lenses (labeled by G1 and G2). The redshifts of the source and the lens galaxies are, respectively, z_s = 1.394 (Fassnacht et al. 1996) and z_d = 0.6304 (Myers et al. 1995). We note that the lens galaxies are in a group whose member galaxies all lie within ±300 km s^-1 of the mean redshift (Fassnacht et al. 2006a). Thus, even a conservative limit of 300 km s^-1 for the peculiar velocity of B1608+656 relative to the Hubble flow would only change D_Δt by 0.5%. As we will see, this is not significant compared to the systematic error associated with κ_ext. This system is special in that the three relative time delays between the four images were measured accurately with errors of only a few percent: Δt_AB = 31.5^{+2.0}_{-1.0} days, Δt_CB = 36.0^{+1.5}_{-1.5} days, and Δt_DB = 77.0^{+2.0}_{-1.0} days (Fassnacht et al. 1999, 2002). The additional constraints on the lens potential from the extended source analysis and the accurately measured time delays between the images make B1608+656 a good candidate to measure H_0 with few-percent precision. However, the presence of dust and interacting galaxy lenses (visible in Figure 1) complicates this system. In Paper I, we presented a comprehensive analysis that took into account the extended source surface brightness distribution, interacting galaxy lenses, and the presence of dust for reconstructing the lens potential. In the following subsections, we summarize the data analysis and lens modeling from Paper I, and present the resulting Bayesian evidence values (needed in Equation (30)) from the lens modeling.
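The two sensitivity estimates quoted for the time-delay distance (below 0.2% for a redshift uncertainty of 0.0003, and about 0.5% for a 300 km/s peculiar velocity) can be checked numerically. The sketch below assumes a flat universe with Ω_m = 0.3 and w = -1, the standard definition D_Δt = (1 + z_d) D_d D_s / D_ds, and that a peculiar velocity v shifts the observed lens redshift by (1 + z_d) v/c. All function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def comoving_integral(z, omega_m=0.3, n=2000):
    """Integral of dz'/E(z') from 0 to z in flat LCDM, by Simpson's rule (n even)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    h = z / n
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return total * h / 3.0

def dt_distance(z_d, z_s):
    """Time-delay distance in units of c/H0; in a flat universe this is chi_d*chi_s/(chi_s-chi_d)."""
    chi_d, chi_s = comoving_integral(z_d), comoving_integral(z_s)
    return chi_d * chi_s / (chi_s - chi_d)

Z_D, Z_S = 0.6304, 1.394
BASE = dt_distance(Z_D, Z_S)

def frac_change(dz_d=0.0, dz_s=0.0):
    """Fractional change in the time-delay distance for small redshift shifts."""
    return abs(dt_distance(Z_D + dz_d, Z_S + dz_s) / BASE - 1.0)

# Redshift measurement uncertainty of 0.0003 on each of the lens and the source.
redshift_effect = frac_change(dz_d=0.0003) + frac_change(dz_s=0.0003)

# Peculiar velocity of 300 km/s applied to the lens redshift (assumed mapping).
peculiar_effect = frac_change(dz_d=(1.0 + Z_D) * 300.0 / C_KM_S)
```

Because H_0 enters only through the overall c/H0 scaling, these fractional changes carry over directly to H_0 at fixed Ω_m, Ω_Λ and w.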

4.1. Summary of observations, data analysis, and lens 
modeling in Paper I 

Deep HST ACS observations on B1608+656 in F606W 
and F814W filters were taken specifically to obtain high 
signal-to-noise ratio images of the lensed source emission. 

In Paper I, we investigated a representative sample of PSF, dust, and lens galaxy light models in order to extract the Einstein ring for the lens modeling. Table 1 lists the various PSF and dust models, and we refer the readers to Paper I for details of each model.

(Footnote: We assume that the redshift of G2 is the same as that of G1.)

The resulting dust-corrected, galaxy-subtracted F814W image allowed us to model both the lens potential and source surface brightness on grids of pixels based on an iterative and perturbative potential reconstruction scheme. This method requires an initial guess potential model that would ideally be close to the true model. In Paper I, we adopted the SPLE1+D (isotropic) model from Koopmans et al. (2003) as the initial model, which is the most up-to-date, simply-parametrized model combining both lensing and stellar dynamics. In the current paper, we additionally investigate the dependence on the initial model by describing the lens galaxies as SPLE models for a range of slopes (γ' = 1.5, 1.6, ..., 2.5). In contrast to the SPLE1+D (isotropic) model, the parameters for the SPLE models with variable slopes are constrained by lensing data only, without the velocity dispersion measurement.

The source reconstruction provides a value for the Bayesian evidence, P(d|γ', η, δψ, M_D), which can be used for model comparison (where model refers to the PSF, dust, lens galaxy light, and lens potential model). The reconstructed lens potential (after the pixelated corrections δψ) for each data model (PSF, dust, lens galaxy light) leads to three estimates of the Fermat potential differences between the image positions. These are presented in the next subsection for the representative set of PSF, dust, lens galaxy light, and pixelated potential models.

4.2. Lens modeling results 

In Paper I, we successfully used a pixelated reconstruction method to model small deviations from a smooth lens potential model of B1608+656. The resulting source surface brightness distribution is well-localized, and the most probable potential correction δψ_MP has angular structure approximately following a cos φ mode with amplitude ~2%. The cos 2φ mode, which could mimic an additional external shear or lens mass distribution ellipticity, has a lower amplitude still, indicating that the smooth model of Koopmans et al. (2003), which includes an external shear of ~0.08, is giving an adequate account of the extended image light distribution. This was the main result of Paper I. The key ingredient in the ACS prior for the lens density profile slope parameter γ' (Equation (30)) coming from this analysis is the likelihood P(d|γ', M_D). For a particular choice of slope γ' and data model M_D, this is just the evidence value resulting from the Paper I reconstruction. In this section, our objective is to use the results of this analysis to obtain P(γ'|d) and Δφ(γ', κ_ext), marginalizing over a representative sample of data models.

4.2.1. Marginalization of the data model 

Table 1 shows the results of the pixelated potential reconstruction at fixed density slope in the initial smooth lens potential model, for various data models M_D. Specifically, we used the SPLE1+D (isotropic) model in Koopmans et al. (2003) with γ' = 2.05. The uncertainties in the log evidence in Table 1 were estimated as ~0.03 × 10^4 for the log evidence values before potential correction, and ~0.05 × 10^4 for the log evidence values after potential correction.

We see a clear division between models with high and low evidence values, the two groups being separated by a very large factor in probability. Assuming that all the data models M_D are equally probable a priori, the contribution to the marginalized distribution P(π|Δt, d, σ) (Equation (24)) from these lower-evidence models will be negligible.
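To give a feel for the size of the factor separating the two groups, the sketch below takes the corrected-potential log evidence values from Table 1 and converts the smallest gap between the 3-band and 2-band dust models into an odds ratio. It assumes the tabulated values are natural logarithms, which the text does not state explicitly, and it works in log space because the ratio itself overflows double precision.

```python
import math

# Log evidence after potential correction (Table 1), in units of 1e4.
LOG_EVIDENCE = {
    "3-band": [1.77, 1.76, 1.76, 1.75, 1.75],
    "2-band": [1.61, 1.58, 1.41, 1.40],
}

# Most conservative comparison: weakest 3-band model against strongest 2-band model.
gap = (min(LOG_EVIDENCE["3-band"]) - max(LOG_EVIDENCE["2-band"])) * 1e4

# Odds ratio expressed in decades, assuming natural logs (an assumption, see above).
log10_odds = gap / math.log(10.0)
```

Under that assumption even the smallest gap corresponds to several hundred decades in probability, which is why the lower-evidence models drop out of the marginalization with equal prior weights.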

The physical difference between these evidence-ranked data models is in the dust correction: the 2-band dust models are found to be less probable than the 3-band dust models. It is useful to quantify the systematic error that would occur with the use of 2-band dust models (which the evidence ranking allowed us to avoid) in terms of the H_0 value implied by the system. For this simple error estimation we use Equation (5) and assert Ω_m = 0.3, Ω_Λ = 0.7, w = -1 and zero external convergence, as a fiducial reference cosmology (Koopmans et al. 2003). The implied Hubble constants are shown in the final four columns of Table 1. We see that the disfavored use of the 2-band dust maps would have led to values of H_0 some 15% lower than that inferred from the 3-band maps.
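The conversion from a relative Fermat potential and a measured delay to an implied Hubble constant can be sketched as follows. The code assumes the fiducial flat cosmology stated above (Ω_m = 0.3, Ω_Λ = 0.7, w = -1), zero external convergence, and the standard definition D_Δt = (1 + z_d) D_d D_s / D_ds, which is taken here to correspond to the paper's Equation (5). Function names are hypothetical.

```python
import math

KM_PER_MPC = 3.0856775814913673e19
ARCSEC_RAD = math.pi / (180.0 * 3600.0)
SEC_PER_DAY = 86400.0

def comoving_integral(z, omega_m=0.3, n=2000):
    """Integral of dz'/E(z') from 0 to z in flat LCDM, by Simpson's rule (n even)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    h = z / n
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return total * h / 3.0

def implied_h0(delta_phi_arcsec2, delta_t_days, z_d=0.6304, z_s=1.394, omega_m=0.3):
    """H0 in km/s/Mpc from delta_t = D_dt * delta_phi / c with zero external convergence."""
    chi_d = comoving_integral(z_d, omega_m)
    chi_s = comoving_integral(z_s, omega_m)
    # H0 * D_dt / c for a flat universe; the factor of c cancels in the delay formula.
    d_hat = chi_d * chi_s / (chi_s - chi_d)
    h0_per_sec = d_hat * delta_phi_arcsec2 * ARCSEC_RAD**2 / (delta_t_days * SEC_PER_DAY)
    return h0_per_sec * KM_PER_MPC

# Model 5 in Table 1: relative Fermat potential 0.244 arcsec^2 for pair AB, delay 31.5 days.
h0_ab = implied_h0(0.244, 31.5)
```

With these inputs `h0_ab` comes out close to the 78.1 km/s/Mpc listed for Model 5; small differences are expected from the three-digit rounding of the tabulated Fermat potentials.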

We note that the evidence values of each of the 3-band dust map models M_D are the same within their uncertainties. We can also see that for good data models, specifically M_D = M_5, the three H_0 values have low scatter: these lens models are internally self-consistent. Furthermore, the scatter between the values for the different good data models is also low: the high evidence data models consistently return the same Hubble constant. This is the basis for the approximations (in Section 3.3 and Appendix A) that the likelihood P(Δt|ξ) is effectively constant with the 3-band dust map models M_D. Assuming that we have indeed obtained the optimal set of M_D, we can approximate the likelihoods in Equations (30) and (33) as being evaluated for model M_5.

4.2.2. Effects of the potential corrections 

Having approximately marginalized out M_D by conditioning on M_D = M_5, we now consider the impact of the potential corrections discussed in Paper I. In particular, we seek the likelihood for the density profile slope parameter γ', P(d|γ' = γ'_i, η, δψ_MP, M_D = M_5). We characterize this function on a grid of slope values in the range of γ' = 1.5, 1.6, ..., 2.5, first re-optimizing the parameters of the smooth lens model, and then computing the source reconstruction evidences both with and without potential correction. These are tabulated in Table 2. We again compute the Fermat potential differences and implied Hubble constant values as before.

The spread of the three implied H_0 values at fixed density slope is again small: we conclude that the internal self-consistency of the lens model depends on the data model but not γ'. The table also shows that the smooth SPLE model provides a good estimate of the relative Fermat potentials. Indeed, this was the principal conclusion of Paper I. The relative thickness of the arcs is sensitive to the SPLE density profile slope γ', as can be seen in the first two columns of Table 2: the evidence clearly favors γ' ~ 2.05, as previously found by Koopmans et al. (2003). Indeed, exponentiating this gives quite a sharply peaked function, which we return to below.

How is the potential correction then affecting the model? In Table 2, we can see that the corrected potential leads to nearly the same evidence value (P(d|γ' = γ'_i, η, δψ_MP, M_D = M_5)) for a wide range of underlying density slopes, and yet barely changes the relative Fermat potential values. The unchanging nature of the Fermat potential is due to the curvature type of regularization on the potential corrections suppressing the addition of mass within the potential reconstruction annulus. From Kochanek (2002), the relative Fermat potential depends only on the mean surface mass density enclosed in the annulus between the images, to first order in δR/⟨R⟩, where δR is the difference in the radial distance of the image locations from the effective center of the lens galaxies and ⟨R⟩ is the mean radius of the images. The mean surface mass density depends on the slope of the initial SPLE model (hence the trend we see in relative Fermat potential in the left-hand side of Table 2), but not on the potential corrections due to the curvature regularization imposed. Therefore, to first order in δR/⟨R⟩, the Fermat potential depends only indirectly on γ' via the mean surface mass density. The second order term is very small: it has a prefactor of 1/12 and for B1608+656, (δR/⟨R⟩)² ~ 0.1, so it contributes at roughly the 1% level. Therefore, for good and self-consistent data models, the potential corrections δψ_MP do not change the Fermat potential significantly.

The right-hand side of Table 2, where a wide range of
initial slope values provide good fits to the data, is there- 
fore effectively a manifestation of the mass-sheet degen- 
eracy. One can understand the effect of the potential 
corrections as making local corrections to the effective 
density profile slope in order to fit the ACS data. The 
change in slope by the pixelated corrections would cre- 
ate a deficit/surplus of mass in the annulus, which the 
pixelated potential corrections then add/subtract back 
into the annulus in the form of a constant mass sheet 
to (i) enforce the prior (no net addition of mass within 
annulus) and (ii) continue to fit the arcs equally well. 

We conclude that the value of the potential correction analysis is in demonstrating that the double SPLE model for B1608+656 is, despite the system's complexity, a good model for the high fidelity HST data. The corrections are small in magnitude (~2% relative to the initial SPLE model), and the inclusion of the δψ neither significantly reduces the dispersion in implied H_0 values between the image pairs, nor alters the rank order of the data models. We therefore use the information on the slope of the initial SPLE model from the ACS data without potential corrections, thus using the information on the relative thickness of the lensed extended images clearly present. How we derive our estimate for P(d|γ', M_D) from column 2 of Table 2 is described next.

4.2.3. The ACS posterior PDF for γ'

TABLE 1

LOG EVIDENCE VALUES AND RELATIVE FERMAT POTENTIAL VALUES BEFORE AND AFTER THE PIXELATED POTENTIAL RECONSTRUCTION FOR VARIOUS DATA MODELS WITH THE SPLE1+D (ISOTROPIC) INITIAL MODEL

        Data Model         Initial    Corrected Potential
Model   PSF   Dust         log P      log P     Δφ_AB   Δφ_CB   Δφ_DB   H_0^AB  H_0^CB  H_0^DB  H_0 (mean)
                           (×10^4)    (×10^4)
5       B1    3-band       1.56       1.77      0.244   0.279   0.575   78.1    78.1    75.1    77.1 ± 1.7
9       C     B1/3-band    1.56       1.76      0.240   0.280   0.563   76.7    78.3    73.5    76.2 ± 2.4
3       C     3-band       1.60       1.76      0.243   0.277   0.570   77.6    77.5    74.4    76.5 ± 1.8
2       drz   3-band       1.48       1.75      0.238   0.278   0.548   76.0    77.7    71.6    75.1 ± 3.1
7       B2    3-band       1.55       1.75      0.237   0.274   0.571   75.7    76.7    74.6    75.7 ± 1.0
11      B1    no dust      1.27       1.72      0.229   0.263   0.576   73.2    73.6    75.3    74.0 ± 1.1
10      B1    C/2-band     1.36       1.61      0.193   0.227   0.565   61.8    63.5    73.8    66.4 ± 6.4
4       C     2-band       1.40       1.58      0.199   0.234   0.560   63.6    65.6    73.1    67.4 ± 5.0
6       B1    2-band       1.10       1.41      0.196   0.226   0.559   62.5    63.2    73.0    66.2 ± 5.8
8       B2    2-band       1.23       1.40      0.201   0.234   0.556   64.3    65.4    72.7    67.4 ± 4.5

Δφ and H_0 values from initial SPLE1+D (isotropic):
                                                0.243   0.271   0.575   77.7    75.8    75.1    76.2 ± 1.3

Notes. The uncertainties in the log evidence before and after the potential corrections are ~0.03 × 10^4 and ~0.05 × 10^4, respectively. The relative Fermat potentials are in units of arcsec², and the H_0 values are in units of km s^-1 Mpc^-1. The values in the final column are the mean and standard deviation from the mean of the three estimates obtained using the initial/corrected potential and the three time delays, without taking into account the uncertainties associated with the time delays. These H_0 values assume Ω_m = 0.3, Ω_Λ = 0.7 and w = -1, and are listed purely to aid the digestion of the Δφ values. The full analysis for obtaining the probability distribution for the cosmological parameters is described in Section 8.

In the previous section, we explored the HST data constraints on the slope parameter, optimizing the other parameters of the SPLE lens model at each step. To characterize properly P(γ'|d, M_D = M_5) in Equation (30), we would need to marginalize over all lens parameters η instead. However, as we shall now see, this optimization approximation is actually a good one and is certainly the most tractable solution due to the high dimensionality of the problem (16 parameters to describe G1, G2 and external shear). Direct sampling in the 16-dimensional parameter space of P(d|γ', η, M_D =



TABLE 2

LOG EVIDENCE VALUE BEFORE AND AFTER THE PIXELATED POTENTIAL RECONSTRUCTION FOR INITIAL MODELS WITH VARIOUS SLOPE USING PSF-B1 AND THE 3-BAND DUST MAP (M_D = MODEL 5)

Initial Potential
γ'    log P     Δφ_AB   Δφ_CB   Δφ_DB   H_0^AB  H_0^CB  H_0^DB  H_0 (mean)
      (×10^4)
1.5   1.38      0.125   0.139   0.287   40.2    39.0    37.6    38.9 ± 1.3
1.6   1.48      0.147   0.163   0.338   47.2    45.8    44.3    45.7 ± 1.4
1.7   1.52      0.174   0.193   0.403   55.5    54.0    52.7    54.0 ± 1.4
1.8   1.54      0.190   0.211   0.442   60.8    59.1    57.7    59.2 ± 1.5
1.9   1.58      0.210   0.234   0.491   67.1    65.4    64.1    65.6 ± 1.4
2.0   1.60      0.229   0.256   0.540   73.3    71.6    70.5    71.8 ± 1.3
2.1   1.60      0.247   0.276   0.586   79.0    77.3    76.6    77.6 ± 1.2
2.2   1.58      0.264   0.296   0.632   84.5    82.8    82.6    83.3 ± 1.0
2.3   1.57      0.281   0.315   0.676   89.8    88.0    88.3    88.7 ± 0.9
2.4   1.55      0.297   0.332   0.720   94.8    92.8    94.0    93.9 ± 1.0
2.5   1.49      0.312   0.348   0.763   99.8    97.4    99.6    98.9 ± 1.3

Corrected Potential
γ'    log P     Δφ_AB   Δφ_CB   Δφ_DB   H_0^AB  H_0^CB  H_0^DB  H_0 (mean)
      (×10^4)
1.5   1.73      0.130   0.143   0.290   41.7    40.2    38.0    39.9 ± 1.9
1.6   1.77      0.150   0.170   0.349   48.1    47.6    45.6    47.1 ± 1.3
1.7   1.75      0.178   0.201   0.417   57.0    56.2    54.5    55.9 ± 1.3
1.8   1.77      0.194   0.215   0.457   61.9    60.2    59.7    60.7 ± 1.2
1.9   1.76      0.210   0.237   0.510   67.3    66.4    66.6    66.8 ± 0.5
2.0   1.79      0.231   0.261   0.549   73.8    73.0    71.7    72.9 ± 1.1
2.1   1.79      0.250   0.287   0.606   80.0    80.1    79.1    79.8 ± 0.5
2.2   1.77      0.258   0.299   0.648   82.5    83.7    84.6    83.7 ± 1.1
2.3   1.79      0.267   0.311   0.678   85.3    86.9    88.5    86.9 ± 1.6
2.4   1.79      0.299   0.344   0.738   95.6    96.3    96.4    96.2 ± 0.4
2.5   1.78      0.311   0.357   0.759   99.4    99.7    99.1    99.5 ± 0.3



Notes. Notation and uncertainties are the same as those described in the notes for Table 1.

M_5) P_noACS(γ') P(η) in Equation (30) via, for example, Markov chain Monte Carlo (MCMC) techniques using the extended source information is not feasible on a reasonable time scale. Importance sampling of the prior PDF from the radio data of image positions and fluxes (P_noACS(γ', η) = P_noACS(γ', η|radio)) by weighting the samples by P(d|γ', η, M_D = M_5) is difficult since γ' is effectively unconstrained by the radio data (the χ² changes by < 1 in the slope range between 1.5 and 2.5). It is precisely the unconstrained nature of the γ' parameter that makes the optimization approximation so good. The "tube" of γ'-degeneracy traversing the 16-dimensional parameter space dominates the uncertainties in the parameters. We thus assume that the tube of γ'-degeneracy has negligible thickness (a degeneracy curve), and use P(d|γ', η, M_D = M_5) to break the degeneracy. Specifically, we use the radio observations, HST Near Infrared Camera and Multi-Object Spectrometer 1 (NICMOS) images (Proposal 7422; PI: Readhead), and time delay data to obtain the best-fitting η for a given γ' = γ'_i (assuming Ω_m = 0.3, Ω_Λ = 0.7 and w = -1 in using the time delay data), and compute the corresponding P(d|γ'_i, η, M_D = M_5). These are the listed evidence values in the second column of Table 2 for the various γ'_i values. The time delay data are included because the predicted relative Fermat potentials among the image pairs using the radio and NICMOS data are otherwise inconsistent with one another. The optimized parameters from only the radio and NICMOS data lead to χ² ~ 600 for just the time delay data; including the time delay data reduces the time delay χ² to ~1 with only a mild increase in the radio and NICMOS χ² of ~6. We "undo" the inclusion of the time delay data (so that we do not use the time delay data twice in the importance sampling of Equation (33)) by subtracting the log likelihood of the time delay from the log likelihood of d: the effect is negligible since the latter is ~10^4 higher in magnitude.

(Footnote: We set γ'_G1 = γ'_G2 = γ' since the slope of G2 is ill-constrained; Koopmans et al. 2003.)
Our thin degeneracy tube assumption implies that P(d|γ') ≃ P(d|γ', η̂), where η̂ denotes the optimized lens parameters, such that the posterior PDF for the slope is P(γ'|d) ∝ P(d|γ') P_noACS(γ'). Assigning a uniform prior (i.e., P_noACS(γ') is constant), we arrive at the result that our desired PDF is just the exponentiation of the log evidence in column 2 of Table 2. Fitting these log evidences with the following quadratic function,

$$ \log P(\gamma' \,|\, d) = C - \frac{(\gamma' - \gamma'_0)^2}{2\sigma_{\gamma'}^2}, \tag{34} $$

we obtain the following best-fit parameter values: γ'_0 = 2.081 ± 0.027, σ_γ' = 0.0091 ± 0.0008, and C = (1.60 ± 0.01) × 10^4. While the PDF width σ_γ' is very small, the centroid is not well determined. Adding σ_γ' and the uncertainty in γ'_0 in quadrature, we finally approximate P(γ'|d) with a Gaussian centered on 2.08 with standard deviation 0.03. This provides the prior on γ' from the ACS data (in Equation (33)).
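The quadratic fit of Equation (34) can be reproduced from the initial-potential log evidence column of Table 2. The sketch below uses an unweighted least-squares parabola, which is an assumption (the weighting used in the paper is not stated), and then adds the quoted centroid uncertainty of 0.027 in quadrature to the fitted width.

```python
import numpy as np

# Slope grid and initial-potential log evidence (Table 2, column 2).
gamma_grid = np.linspace(1.5, 2.5, 11)
log_evidence = 1e4 * np.array(
    [1.38, 1.48, 1.52, 1.54, 1.58, 1.60, 1.60, 1.58, 1.57, 1.55, 1.49]
)

# Fit log P = a*u^2 + b*u + c with u = gamma' - 2 (centering improves conditioning).
a, b, c = np.polyfit(gamma_grid - 2.0, log_evidence, 2)

# Match coefficients to C - (gamma' - gamma0)^2 / (2 sigma^2).
gamma0 = 2.0 - b / (2.0 * a)
sigma_gamma = np.sqrt(-1.0 / (2.0 * a))
peak_c = c - b**2 / (4.0 * a)

# Combine the PDF width with the quoted centroid uncertainty in quadrature.
total_width = np.hypot(sigma_gamma, 0.027)
```

An unweighted fit of this kind recovers a centroid near 2.08 and a width near 0.009, and the quadrature sum rounds to the adopted standard deviation of 0.03.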

The deep ACS data therefore allow a significant improvement to the previous measurement in Koopmans et al. (2003) of γ' = 1.99 ± 0.20, which was based on the radio data and the NICMOS ring. Coincidentally, our γ' = 2.08 ± 0.03 is identical, apart from the spread, to the measurement from SLACS of γ' = 2.08 ± 0.2 that was based on a sample of massive elliptical lenses (Koopmans et al. 2009). The spread of 0.2 in the SLACS measurement is the intrinsic scatter of slope values in the sample, and is larger than the typical uncertainties associated with individual systems in the sample of 0.15. We note that our measurement is not the first percent-level determination of a strong lens density profile slope. Wucknitz et al. (2004) used high precision astrometric measurements from VLBI data to constrain the γ' parameter in B0218+357 to be 1.96 ± 0.02 (where we have transformed their β into our notation). However, they did not use exactly the same model as we do here (instead working with combinations of isothermal elliptical potentials and neglecting external convergence). Dye & Warren (2005) measured the power-law slope of the lens galaxy in the Einstein ring system 0047-2808 to be γ' = 2.11 ± 0.04 based on the extended image constraints. More recently, Dye et al. (2008) determined the power-law slope of the extremely massive and luminous lens galaxy in the Cosmic Horseshoe Einstein ring system J1004+4112 to be γ' = 1.96 ± 0.02.

4.2.4. Predicted relative Fermat potentials 

In order to be able to calculate the time delay likelihood function, P(Δt|z_d, z_s, π, γ', κ_ext, M_D), at any value of the slope γ', we need to interpolate the Fermat potential differences given in Table 2. In fact, these data give us the function q(γ') to insert into Equation (25): we can do the interpolation at κ_ext = 0.0 and then rescale by (1 - κ_ext) without loss of generality.

For each of the image pairs, we fit the relative Fermat potential difference as a third-order polynomial function of γ' using the values we have at the discrete points γ'_i for the SPLE models in the table. Recall that the SPLE model provides an unbiased estimate of the relative Fermat potential, and that the various top data models M_D gave consistent estimates. Thus, the polynomial fit gives the function q(γ', M_D) in Equation (25). The third-order polynomial fit leads to residuals (= (Δφ_i - Δφ_i^poly)/Δφ_i) of < 1% for all image pairs at all slope points in Table 2 except for γ'_i = 1.7, which has residuals of ~2%.
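The interpolation step can be sketched with an ordinary least-squares cubic through the initial-potential Fermat potential differences of Table 2, followed by the (1 - κ_ext) rescaling of Equation (25). The use of `numpy.polyfit` with equal weights and the helper names are assumptions made for illustration.

```python
import numpy as np

GAMMA_GRID = np.linspace(1.5, 2.5, 11)

# Relative Fermat potentials (arcsec^2) from the initial SPLE models in Table 2.
DELTA_PHI = {
    "AB": [0.125, 0.147, 0.174, 0.190, 0.210, 0.229, 0.247, 0.264, 0.281, 0.297, 0.312],
    "CB": [0.139, 0.163, 0.193, 0.211, 0.234, 0.256, 0.276, 0.296, 0.315, 0.332, 0.348],
    "DB": [0.287, 0.338, 0.403, 0.442, 0.491, 0.540, 0.586, 0.632, 0.676, 0.720, 0.763],
}

# Third-order polynomial in (gamma' - 2) for each image pair.
COEFFS = {pair: np.polyfit(GAMMA_GRID - 2.0, vals, 3) for pair, vals in DELTA_PHI.items()}

def q_of_gamma(pair, gamma_prime):
    """Interpolated relative Fermat potential at zero external convergence."""
    return float(np.polyval(COEFFS[pair], gamma_prime - 2.0))

def delta_phi(pair, gamma_prime, kappa_ext=0.0):
    """Relative Fermat potential rescaled by (1 - kappa_ext)."""
    return (1.0 - kappa_ext) * q_of_gamma(pair, gamma_prime)

# Largest fractional residual of the fit for each pair.
MAX_RESIDUAL = {
    pair: float(np.max(np.abs(np.polyval(COEFFS[pair], GAMMA_GRID - 2.0) - vals) / np.array(vals)))
    for pair, vals in DELTA_PHI.items()
}
```

For example, `delta_phi("AB", 2.08, kappa_ext=0.1)` gives the AB Fermat potential difference at the peak of the ACS slope prior with a 10% external convergence.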

5. BREAKING THE MASS-SHEET DEGENERACY: 
STELLAR DYNAMICS 

In this section, we present the observations and data reduction for measuring the velocity dispersion σ of G1 in B1608+656. This measurement appears as the likelihood function given in Equation (32) above.

5.1. Observations 

We have obtained a high signal-to-noise spectrum of B1608+656 using the Low-Resolution Imaging Spectrometer (LRIS; Oke et al. 1995) on Keck 1. The data were obtained from the red side of the spectrograph on 12 June 2007 using the 831/8200 grating with the D680 dichroic in place. A slit mask was employed to obtain simultaneously spectra for two additional strong lenses in the field (Fassnacht et al. 2006b) and to continue to probe the structure along the line of sight to the lens (Fassnacht et al. 2006a). The night was clear with a nominal seeing of 0.9 arcsec, and 10 exposures of 1800 s and one exposure of 600 s were obtained for a total exposure time of 18600 s.

Each exposure was reduced individually using a custom
pipeline (see Auger et al. 2008, for details) that performs
a single resampling of the spectra onto a constant
wavelength grid; the same wavelength grid was used for all
exposures to avoid resampling the spectra when combining
them, and an output pixel scale of 0.915 Å was used
to match the dispersion of the 831/8200 grating. Individual
spectra were extracted from an aperture 0.84" wide
(corresponding to 4 pixels on the LRIS red side) centered
on the peak of the flux of the lensing galaxy G1. The size
of the aperture was chosen to avoid contamination from
the spectrum of G2 while maximizing the total flux for
an improved signal-to-noise ratio. The extracted spectra
were combined by clipping the extreme points at each
wavelength and taking the variance-weighted sum of the
remaining data points. The same extraction and coaddition
scheme was performed for a sky aperture to determine
the resolution of the output co-added spectrum;
we find the resolution to be R = 2560, corresponding to
σ_obs = 49.7 km s^-1. The signal-to-noise ratio per pixel of
the final spectrum is ~ 60.
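
The conversion from resolving power to an instrumental dispersion can be checked with a few lines of Python. The sketch assumes that R is defined as λ/FWHM and that the line-spread function is Gaussian, so that σ = FWHM / (2 sqrt(2 ln 2)); neither assumption is spelled out in the text, and the function name is hypothetical.

```python
import math

C_KMS = 299792.458  # speed of light in km/s


def resolution_to_sigma(resolving_power):
    """Convert R = lambda / FWHM into a Gaussian sigma in km/s."""
    fwhm_kms = C_KMS / resolving_power
    return fwhm_kms / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_obs = resolution_to_sigma(2560.0)
print(f"sigma_obs = {sigma_obs:.1f} km/s")
```

Under these assumptions the result reproduces the quoted 49.7 km s^-1.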

5.2. Velocity dispersion measurement 

We use a Python-based implementation of the velocity-
dispersion code from van der Marel (1994), with one
important modification. Our implementation allows for a
linear sum of template spectra to be modeled using a
bounded variable least squares solver with the constraint
that each template must have a non-negative coefficient.
We use a set of templates from the INDO-US stellar
library containing spectra for a set of seven K and G
giants with a variety of temperatures and spectra for an
F2 and an A0 giant. These templates of early-type stars
are particularly important for B1608+656, which has a
post-starburst spectrum (Myers et al. 1995).
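
The non-negative template combination can be illustrated with SciPy's bounded-variable least squares solver. This is a toy sketch, not the authors' implementation: the "templates" are random arrays standing in for broadened stellar spectra, the galaxy spectrum is noise-free, and all variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
n_pix, n_templates = 200, 3

# Toy template spectra as columns; real templates would be stellar spectra
# broadened to a trial velocity dispersion.
templates = rng.uniform(0.5, 1.5, size=(n_pix, n_templates))
true_coeffs = np.array([0.7, 0.0, 0.3])
galaxy = templates @ true_coeffs  # noise-free toy galaxy spectrum

# Bounded-variable least squares: every template coefficient must be >= 0.
fit = lsq_linear(templates, galaxy, bounds=(0.0, np.inf), method="bvls")
coeffs = fit.x
```

The lower bound of zero prevents the fit from subtracting one stellar type to compensate for another, which would have no physical meaning for a composite stellar population.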

Fig. 2. The LRIS spectrum of B1608+656 (black line) with
a model generated from all 9 INDO-US templates and a 9th order
continuum overplotted (red line). The gray shaded areas were not
included in the fit, and the lower panel shows the fit residuals. The
spectrum and our modeling suggest a central velocity dispersion of
σ = 260 ± 15 km s^-1, including systematic errors.

We perform our modeling over a wide range of wavelength
intervals and find a stable solution over a variety
of spectral features; we therefore choose to use
the rest-frame range from 4200 Å to 4900 Å for our
fit. The INDO-US templates have a constant-wavelength
resolution of 1.2 Å, which corresponds to σ_template =
33.6 km s^-1 over this wavelength range. We iterate over
a range of template combinations and polynomial continuum
orders and find a variety of solutions that
vary around 260 km s^-1 with a spread of about 13 km s^-1
and statistical uncertainties of 7.7 km s^-1 (see Figure
2). We therefore adopt a velocity dispersion of σ =
260 ± 15 km s^-1, with the error incorporating the systematic
template mismatch and the statistical error for the
models. This agrees with the previous measurement of
σ_ap = 247 ± 35 km s^-1 by Koopmans et al. (2003) with a
significant reduction in the uncertainties, though we note
that the two velocity dispersions have been measured in
slightly different apertures.
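
The two numbers quoted above can be reproduced approximately with the sketch below. It assumes that the 1.2 Å template resolution is a Gaussian FWHM evaluated at the middle of the 4200-4900 Å fitting range, and that the 13 km s^-1 template spread and the 7.7 km s^-1 statistical error are combined in quadrature; the text does not state either choice explicitly, and the function name is hypothetical.

```python
import math

C_KMS = 299792.458  # speed of light in km/s
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def fwhm_to_sigma_kms(fwhm_angstrom, wavelength_angstrom):
    """Gaussian sigma in km/s for a wavelength FWHM at a given wavelength."""
    return C_KMS * fwhm_angstrom / wavelength_angstrom * FWHM_TO_SIGMA


# Template resolution evaluated at the centre of the fitting range (assumed).
sigma_template = fwhm_to_sigma_kms(1.2, 0.5 * (4200.0 + 4900.0))

# Systematic (template spread) and statistical errors added in quadrature.
total_error = math.hypot(13.0, 7.7)

print(f"sigma_template = {sigma_template:.1f} km/s")
print(f"total error    = {total_error:.1f} km/s")
```

Both values round to the figures adopted in the text (33.6 km s^-1 and 15 km s^-1).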

6. BREAKING THE MASS-SHEET DEGENERACY: LENS 
ENVIRONMENT 

In this section, we outline two approaches for quantifying
the prior probability distributions of the external
mass sheet κ_ext. Computing this quantity such that
Equation ^ holds true is not a trivial matter. The 
non-linearity of strong lensing means that the surface
mass density at a given angular position in successive
redshift planes between the observer and the source cannot
simply be scaled by the appropriate distance ratios
and summed: rather, the deflection angles (which can
be large) need to be taken into account when calculating
the distortion matrices (which contain and define the
external convergence and shear), leading us towards a
ray-tracing approach (Hilbert et al. 2009). Detailed
investigation of the ray paths down the B1608+656 light
cone is beyond the scope of this paper, and we defer it
to a later work (Blandford et al. in preparation). In this
section we use the statistics of B1608+656-like fields in
numerical simulations to derive a PDF for κ_ext.

6.1. Ray-tracing through the Millennium Simulation 

Following Hilbert et al. (2007), we use the multiple-
lens-plane algorithm to trace rays through the Millennium
Simulation (MS; Springel et al. 2005), one of the
largest N-body simulations of cosmic structure formation.
We then identify lines of sight where strong lensing
by matter structures at z_d = 0.63 occurs for sources
at z_s = 1.39. The convergence along these lines of sight
is estimated by summing the projected matter density
on the lens planes weighted for a source at z_s = 1.39
along the ray trajectory. By excluding the primary lens
plane at z_d = 0.63 that causes the strong lensing, the
constructed convergence is truly external to the lens and
is due to the line-of-sight contributions only. By sampling
many lines of sight, we obtain an estimate for the
probability density function of κ_ext from simulations. We
denote this as the "MS" prior on κ_ext.

Fig. 3. Probability distribution for the external convergence
κ_ext along strongly lensed lines of sight from the Millennium
Simulation for the lens redshift z_d and source redshift z_s of B1608+656
(solid line) compared to the convergence distribution for all lines
of sight (dotted line).

Figure 3 shows the predicted amount of external convergence
constructed using 6.4 x 10* lines of sight (with
and without strong lenses) to sources at z_s = 1.39: of
these, 8.0 x 10"^ lines of sight contain strong lenses. For
both curves, the mean κ_ext is consistent with zero with
a spread of ~ 0.04.

How should we interpret this distribution? According
to its definition, κ_ext could have contributions from
galaxies on the primary lens plane that do not affect the
dynamics. Neglecting these contributions (effectively
assuming that the lens is an isolated galaxy) might lead
to an underestimate of κ_ext, since most lenses are massive
galaxies that often live in over-dense environments
like galaxy groups and clusters. However, if the local
contribution to the external convergence is accounted for
in the lensing plus dynamics modeling (as discussed in
Fassnacht et al. 2006a), then the MS PDF will give an
accurate uncertainty in the inferred Hubble constant after
marginalization.

Indeed, what the MS PDF also verifies is that on aver- 
age the contribution to the external convergence at a 
strong lens from line-of-sight structures is almost the 
same as that for a random line of sight, namely zero. 

The details of the ray-tracing algorithm are described in
Hilbert et al. (2009). The methods for sampling lines of sight,
identifying strong lensing events, and calculating the convergence
are described in Hilbert et al. (2007). Note that we also include a
stellar component in the ray-tracing as described in Hilbert et al.
(2008).

It is beyond the scope of this paper to quantify this contribu- 
tion from our ray-tracing simulations. This would require modeling 
the lenses and their environment in a way that allows one to split 
the mass distribution into a part that is accounted for by the lens 
model (and constrained by lensing and dynamics data) and a part 
that acts as external convergence. 
Fig. 4. Probability distribution for the external convergence
κ_ext obtained from combining results of galaxy number counts
around B1608+656 with results from ray-tracing through the
Millennium Simulation. Compared are the distribution along lines
of sight with a relative galaxy number density n_gal/⟨n_gal⟩ =
2.00 ± 0.05 (solid line) to the distribution along all lines of sight
(dotted line).

The MS prior therefore suggests that ensembles of isolated
strong lenses will yield estimates of cosmological
parameters that are not strongly biased by line-of-sight
structures. The PDF in Figure 3 gives us an idea of how
much the line-of-sight κ_ext values of individual lenses vary,
and hence an estimate of the uncertainty on H_0 due to
this structure. In the absence of any other information,
we can assign the Millennium Simulation PDF as a prior
on κ_ext in order to limit the possible values of external
convergence to those likely to occur. This assignment has
the effect of adding an additional uncertainty of ~ 0.04
in κ_ext, with no systematic shift in κ_ext.
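
To see how a spread in κ_ext propagates to H_0, one can use the (1 − κ_ext) rescaling of the Fermat potential together with the fact that the time-delay distance scales inversely with H_0, so that the inferred H_0 scales as (1 − κ_ext). The sketch below applies this standard mass-sheet scaling as an assumption; the function names are hypothetical and the numbers are purely illustrative.

```python
def rescale_h0(h0_without_kext, kappa_ext):
    """H0 after accounting for an external convergence kappa_ext."""
    return (1.0 - kappa_ext) * h0_without_kext


def fractional_h0_scatter(kappa_scatter, kappa_mean=0.0):
    """Fractional H0 scatter induced by a scatter in kappa_ext."""
    return kappa_scatter / (1.0 - kappa_mean)


# A kappa_ext spread of 0.04 around zero maps to a ~4% spread in H0.
print(f"{100.0 * fractional_h0_scatter(0.04):.1f}% scatter in H0")
```

A prior centered on zero therefore widens the H_0 posterior without shifting it, which is the behavior described for the MS prior.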

6.2. Combining galaxy density observations with 
ray-tracing simulations 

The prior discussed in the preceding section does not
take into account any information about the environment
of B1608+656. Here, we combine knowledge of the lens
environment with ray-tracing to obtain a more informative
prior on the external convergence.

Fassnacht et al. (2009) compared galaxy number
counts in fields around strong galaxy lenses, including
B1608+656, with number counts in random fields and in
the COSMOS field. Among other measures, they used
the number of galaxies with apparent magnitude 18.5 <
m_F814W < 24.5 in the F814W filter band in apertures
of 45" radius (300 kpc at the redshift of B1608+656) to
quantify the galaxy number density n_gal projected along
lines of sight. They found that the distribution of n_gal for
lines of sight containing strong lenses is not very different
from that for random lines of sight. However, B1608+656
lies along a line of sight with a galaxy density n_gal that is
about twice the mean over random lines of sight, ⟨n_gal⟩.
A positive κ_ext bias can arise through Poissonian fluctuations
that are present in the number of groups along the
line of sight in the observed sample of strong lenses.

We can use this measurement of galaxy number density
in the B1608+656 field to generate a more informative
prior PDF for κ_ext. As for the MS prior in
the previous section, we use the ray-tracing through
the MS together with the semi-analytic galaxy model
of De Lucia & Blaizot (2007) to quantify the expected
external convergence κ_ext for lines of sight with a given
relative overdensity n_gal/⟨n_gal⟩. Dividing out the absolute
number of galaxies in the field accounts for differences
due to the particular set of cosmological parameters
used by the Millennium Simulation and inaccuracies
in the galaxy model: we assume that differences in the
relative overdensity between the MS cosmology and the
true one are small.

We generate 32 simulated fields of 4 × 4 deg^2 on the sky
containing the positions and apparent magnitudes of
the model galaxies at redshifts 0 < z < 5.2 together with
maps of the convergence κ to source redshift z_s = 1.39.
The galaxy positions and magnitudes in the simulated
fields are converted into maps of the galaxy density n_gal.
We then select all lines of sight with relative overdensity
1.95 < n_gal/⟨n_gal⟩ < 2.05 and compute the distribution
of the convergence along these lines of sight. The resulting
convergence distribution (shown in Figure 4) is then
used as prior distribution for the external convergence
κ_ext, which we denote as the "OBS" (observations and
MS) prior.
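
The selection described above amounts to conditioning the convergence distribution on the relative galaxy density. The following sketch shows the mechanics on synthetic data: the Poisson galaxy counts and the toy relation between counts and convergence are invented for illustration and are not taken from the Millennium Simulation; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_los = 200_000

# Toy stand-ins for the simulated maps: galaxy counts per aperture and a
# convergence that is loosely correlated with the counts plus scatter.
n_gal = rng.poisson(lam=10.0, size=n_los).astype(float)
kappa = 0.004 * (n_gal - 10.0) + rng.normal(0.0, 0.03, size=n_los)

# Relative overdensity and the selection window used for the OBS prior.
relative_density = n_gal / n_gal.mean()
selected = (relative_density > 1.95) & (relative_density < 2.05)
kappa_selected = kappa[selected]

# Normalized histogram: an empirical PDF of kappa for the selected sight lines.
pdf, edges = np.histogram(kappa_selected, bins=30, density=True)
```

Working with the ratio n_gal/⟨n_gal⟩ rather than absolute counts is what makes the selection insensitive to the overall normalization of the galaxy model.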

The convergence computed in this way is not strictly 
speaking external convergence, since (i) we do not sub- 
tract any contribution from any primary strong lens, (ii) 
we take all lines of sight and not just those to strong 
lenses. We are instead building on one of the results of 
the previous section and assume that the distribution of 
external convergences is very similar to the distribution 
of convergences along random lines of sight. 

Where this approach becomes inappropriate is where
a ray passes close to a galaxy center, and is hence associated
with a very large convergence. Assuming such a
line of sight as foreground/background for a strong lens
galaxy essentially creates a lens system with two or more
strong deflectors. These sight lines correspond to compound
lenses such as SDSS J0946+1006 (Gavazzi et al.
2008), but not to B1608+656. However, the tail of high
convergence values does not pose a problem here: as we
will see in Section 8.1 below, the high external convergence
is rejected by the dynamics modeling. We expect
the mean and width of the PDF in Figure 4 to represent
well the possible values of κ_ext for a field that is
over-dense in galaxy number by a factor of two.

Our OBS κ_ext distribution agrees with earlier estimates
from Fassnacht et al. (2006a), who identified and modeled
the four groups along the line of sight to B1608+656 using
various mass assignment recipes. In both approaches,
we and Fassnacht et al. (2006a) are concerned primarily
with extracting information on the external convergence
and not the external shear. If we were to estimate the
external convergence by assigning masses and redshifts to
all objects in the B1608+656 field, and then ray tracing
through the resulting model mass distribution, the external
shear as required in the strong lens modeling would
serve as an important calibrator for the external convergence.
Such a procedure is beyond the scope of this
paper, and we defer it to a future publication (Blandford
et al., in preparation). However, we do find (by computing
the distribution of external shears in MS fields with
different external convergences) that the magnitude of
the external shear required by the strong lens modeling
(γ_ext ≈ 0.075) is consistent with the external shear amplitude
predicted in the OBS scenario for the B1608+656
field.

The model galaxy catalogs do not provide F814W magnitudes.
We simply approximate m_F814W by combining SDSS i-band
and z-band magnitudes to get m_F814W = x_i m_i + (1 − x_i) m_z with
x_i = 0.5. We have checked that our results do not depend strongly
on x_i ∈ [0, 1].

6.3. The influence on lens modeling. 

As already remarked, the description of ray propaga- 
tion in an inhomogeneous cosmology is quite subtle. The 
matter (dark plus baryonic) density is partitioned be- 
tween virialized structures (galaxies, groups and clusters) 
and a depleted background medium. Any structures suf- 
ficiently close to the line of sight will imprint convergence 
and shear onto a ray congruence. Meanwhile the back- 
ground medium will contribute less Ricci focusing than 
would be present in a homogeneous, flat universe and 
will diminish the net convergence. 

As the foregoing discussion makes clear, the line of
sight to B1608+656 is unusual and we know quite a lot
about the photometry and redshifts of the intervening
galaxies. It is therefore possible, in principle, to make
a refined estimate of the external convergence and shear
and to compare the former with the simulations discussed
above and the latter with the shear inferred in the lens
model described in Paper I. In this way, the shear, again
in principle, can be used to calibrate κ_ext.

There is a second complication that must be addressed.
Matter inhomogeneities in front of G1 and G2 distort the
image of the primary lens as well as the multiple images
of the source. Inhomogeneities behind the lens contribute
further distortion in the images of the source. In a more
accurate approach, these effects should be taken into
account explicitly in the construction of the lens model,
while here we are subsuming them in a single correction
factor κ_ext. The way that the resulting corrections affect
the inference of a value for H_0 turns out to be quite
complex. However, it appears that in the particular case
of B1608+656, the error that is incurred does not contribute
significantly to our quoted errors.

These matters will be discussed in a forthcoming pub- 
lication. 

7. PRIORS FOR MODEL PARAMETERS

A key goal of this work is to quantify the impact of 
the most serious systematic errors associated with us- 
ing time-delay lenses for cosmography. Our approach is 
to characterize these errors as nuisance parameters, and 
then investigate the effects of various choices of prior 
PDF on the inference of cosmological parameters. To 
this end, we use either well motivated priors based on 
the results of Section SI Section [6] and other independent 
studies, or, for contrast, uniform (maximally ignorant) 
prior PDFs. We now describe our choices for each pa- 
rameter in turn. 

• P(π). We consider a set of four cosmological parameters,
π = {H_0, Ω_m, Ω_Λ, w}. We then assign
the following four different joint prior PDFs:

K03: uniform prior on H_0 between 0 and
150 km s^-1 Mpc^-1, Ω_m = 0.3, Ω_Λ = 0.7, and
w = −1. This is the cosmology that was assumed
in Koopmans et al. (2003) (the most
recent H_0 measurement from B1608+656 before
this work), and is the cosmology that is
typically assumed in the literature for measuring
H_0 from time-delay lenses. This form of
prior allows us to compare our H_0 to earlier
work.

UNIFORM: uniform priors on all four cosmological
parameters, with either the w = −1 or the flatness
(Ω_m = 1 − Ω_Λ) constraint imposed. These
priors allow us to quantify the information in
the B1608+656 data set as conservatively as
possible.

WMAP5: WMAP 5 year data set posterior PDF
for {H_0, Ω_m, Ω_Λ, w}, assuming either w = −1
or a flat geometry. This allows us to constrain
either flatness or w by combining B1608+656
with WMAP.

WBS: Joint posterior PDF for {H_0, Ω_Λ, w} with a
flat geometry, given the WMAP 5 data in combination
with compendia of BAO and supernovae
(SN) data sets. This allows us to quantify
the gain in precision made when incorporating
B1608+656 into the current global
analysis.

The last two priors are defined by the Markov
chains provided by the WMAP team based on
the analysis performed by Dunkley et al. (2009)
and Komatsu et al. (2009). The BAO data
incorporated were taken from Percival et al. (2007);
the SN sample used is the "union" sample of
Kowalski et al. (2008). While the BAO and
SN data sets are continually improving (e.g.
Hicken et al. 2009), this particular well-defined
snapshot is sufficient for us to explore the relative
information content of our data set compared with
other, well-known cosmological data sets. We also
note that the publication of Markov chain
representations of posterior PDFs makes further joint
analyses like the one we present here very
straightforward indeed.

http://lambda.gsfc.nasa.gov

• P(γ'). We consider three different prior PDFs for
the density profile slope. In the first two priors,
we ignore the B1608+656 ACS data (i.e., we drop
the term P(d | γ', η, M_D = M_5) from the analysis); these
first two are controls, to allow the assessment of
the amount of information contained in the ACS
data.

Uniform: a maximally ignorant prior PDF, defined
in the range 1.5 < γ' < 2.5.

SLACS: This is a Gaussian prior based on the result
from the SLACS project: γ' = 2.08 ± 0.2
(Koopmans et al. 2009). This was derived
from a sample of low-redshift massive elliptical
lenses, studied with combined strong
lens and stellar dynamics modeling. We note
that this was obtained without considering
the presence of any external convergence κ_ext.
However, Treu et al. (2009) find that the
environmental effects in the SLACS lenses are
smaller than their measurement errors and are
typically undetected. Since SLACS lenses do
not require an external shear in the modeling,
typical κ_ext values for these lenses are expected
to be small. Only in a few extreme
cases does the κ_ext reach values of order 0.05-
0.10. Therefore, we take directly the prior on
the slope from SLACS lenses without corrections
for κ_ext.

ACS: This prior is the PDF P(γ' | d, M_D = M_5)
obtained from the analysis of the ACS image
of B1608+656 in Section 4.2. This is the most
informative of the three priors on γ', as it
is determined directly from the B1608+656
data, independent of external priors from
samples of galaxies (e.g. SLACS).

• P(η). As described in Section 221 we use the radio
observations and the NICMOS F160W images
of B1608+656 to constrain the smooth lens model
parameters η for a given slope γ'. The posterior
PDF from this analysis forms the prior PDF for
the current work.

• P(κ_ext). We consider three forms of prior for the
external convergence:

Uniform between −0.25 and +0.25: again, a
maximally ignorant prior, to provide
contrast.

MS: from the strong lenses in the MS, discussed
in Section 6.1.

OBS: from the galaxy number counts in the field
of B1608+656 and the MS, discussed in Section 6.2.

• P(r_ani). For the lens galaxy stellar orbit radial
anisotropy parameter r_ani, we simply assign
a uniform prior between 0.5 r_eff and 5 r_eff,
where r_eff is the effective radius that is determined
from the photometry to be 0.58" ± 0.06"
(Koopmans et al. 2003) for the velocity dispersion
measurement. The uncertainty in r_eff has negligible
impact on the model velocity dispersion.
The inner cutoff of r_ani is motivated by observations
(e.g., Kronawitter et al. 2000) and radial instability
arguments (e.g., Merritt & Aguilar 1985;
Stiavelli & Sparke 1991), while the outer cutoff is
for computational simplicity (the model velocity
dispersion changes by a negligible amount between
r_ani = 5 r_eff and r_ani → ∞). These boundaries are
consistent with those in Gebhardt et al. (2003).

These priors are summarized in Table 3.

8. INFERENCE OF H_0 AND DARK ENERGY
PARAMETERS FROM B1608+656

In this section we present the results of the analysis
outlined in Section 3, putting together all the likelihood
functions and prior PDFs described in Sections 4 to 7.
We obtain P(π | Δt, d, σ) by importance sampling, using
the two likelihoods in Equation (33) as the weights for
the various priors on γ', κ_ext, r_ani, and π listed in Table 3
(see Appendix A.2 for details). By using the likelihood
functions of our B1608+656 data sets, we are incorporating
the uncertainties associated with these measurements.
We expect and indeed find that the data are
relatively insensitive to r_ani and do not constrain it.
Focusing first on the systematic errors now quantified as the
nuisance parameters γ' and κ_ext, we gradually increase
the complexity of the cosmological model to probe the
full space of parameters.

For each possible combination of the priors on the
parameters in Table 3, we generate 96000 samples of γ',
κ_ext, r_ani, and π to characterize the prior probability
distribution. We also have two types of stellar distribution
functions, Hernquist and Jaffe, for modeling the stellar
velocity dispersion; we find that the two different types
of stellar distribution function produce nearly identical
PDFs for the cosmological parameters. Since the priors
on the parameters play a greater role than does the
choice of stellar dynamics model, we focus only on the
Hernquist stellar distribution function for the remainder
of the section.
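
The importance-sampling step can be sketched as follows: draw samples from the priors, weight each sample by the likelihood, and compute weighted summaries. The priors and the Gaussian likelihood below are toy stand-ins chosen for illustration (a uniform H_0 prior, a Gaussian κ_ext prior of width 0.04, and a likelihood that constrains only the combination H_0/(1 − κ_ext)); they are assumptions, not the likelihoods of Equation (33).

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples = 96_000

# Draw samples from toy priors.
h0 = rng.uniform(0.0, 150.0, n_samples)
kappa_ext = rng.normal(0.0, 0.04, n_samples)

# Toy likelihood: the data constrain H0 / (1 - kappa_ext) to 70 +/- 3.
constrained = h0 / (1.0 - kappa_ext)
weights = np.exp(-0.5 * ((constrained - 70.0) / 3.0) ** 2)

# Weighted posterior summaries for H0, marginalized over kappa_ext.
h0_mean = np.sum(weights * h0) / np.sum(weights)
h0_std = np.sqrt(np.sum(weights * (h0 - h0_mean) ** 2) / np.sum(weights))

# Effective sample size: a diagnostic of how well the prior samples
# cover the region favored by the likelihood.
ess = np.sum(weights) ** 2 / np.sum(weights ** 2)
```

Because the prior samples are reused for every choice of likelihood, swapping one prior for another only requires redrawing that parameter, which is convenient when many prior combinations are compared.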

8.1. Exploring the degeneracies among H_0, γ' and κ_ext

To investigate the impact of our limited knowledge
of the lens density profile slope γ' and external convergence
κ_ext, we first fix the cosmological parameters Ω_m,
Ω_Λ and w according to the K03 prior. This allows us
a simplified view of the problem, and also a comparison
with previous work that used this rather restrictive prior.

We first assign the OBS prior for κ_ext, and look at the
effect of the various choices of density profile slope priors.
The left-hand panel in Figure 5 shows the marginalized
posterior PDF for H_0 for the three different priors for
γ' given in Table 3. From this graph, we see that the
SLACS prior gives a similar estimate of H_0 as the uniform
prior with a negligible increase in precision. The
ACS prior lowers H_0 relative to that of the SLACS and
uniform priors, and improves the precision in H_0 to 4.4%.
Overall, the impact of the prior on γ' is relatively low in
the sense that, even with a uniform prior on γ', H_0 is
still constrained to 7% (taking H_0 = 70.6 as our reference
value). For the remainder of this paper, we assign
the ACS prior.

As expected, the prior for κ_ext has a greater effect,
shown in the right-hand panel of Figure 5. Taking
the maximally informative OBS prior as our default,
we see that relaxing this to the MS prior causes an
increase in inferred H_0 value of some 6 km s^-1 Mpc^-1,
and relaxing further to a uniform prior increases it by
12 km s^-1 Mpc^-1. The precision in H_0 also drops by
more than a factor of two from the OBS prior to the
uniform prior. Our knowledge of κ_ext is therefore limiting
the inference of H_0.

We note that the stellar dynamics contain a significant
amount of information on H_0. The stellar dynamics
effectively constrain κ_ext and γ' to an approximately
linear relation, where an increase in κ_ext requires
a steepening of the slope in order to keep the
predicted velocity dispersion the same. Therefore, for
a fixed range of γ' values, the modeling of the stellar
dynamics would only permit a corresponding range of κ_ext
values. Specifically, without dynamics as constraints, we
find Hq = 68.lt^;lkms-iMpc-i for the ACS and OBS 
priors. The lower bound on H_0 is somewhat weakened
by the high tail of the OBS κ_ext distribution. On the
other hand, this high tail is rejected by the use of the
dynamics data. Therefore, our tight constraint on H_0
results from the combination of all available data sets -
TABLE 3

Priors on the parameters

Parameter    Prior
P(γ')        uniform (1.5 < γ' < 2.5)
             SLACS (γ' = 2.08 ± 0.2)
             ACS (γ' = 2.08 ± 0.03)
P(κ_ext)     uniform (−0.25 < κ_ext < 0.25)
             MS (Millennium Simulation; Figure 3)
             OBS (Observations and MS; Figure 4)
P(r_ani)     uniform (0.5 r_eff < r_ani < 5 r_eff)
P(π)         K03 (Ω_m = 0.3, Ω_Λ = 0.7, w = −1,
               uniform H_0 ∈ [0, 150] km s^-1 Mpc^-1)
             UNIFORMopen (w = −1,
               Ω_m and Ω_Λ uniform ∈ [0, 1],
               uniform H_0 ∈ [0, 150] km s^-1 Mpc^-1)
             UNIFORMw (Ω_Λ = 1 − Ω_m uniform ∈ [0, 1],
               uniform w ∈ [−2.5, 0.5],
               uniform H_0 ∈ [0, 150] km s^-1 Mpc^-1)
             WMAPopen (WMAP5 with w = −1)
             WMAPw (WMAP5 with flatness and time-independent w)
             WBSw (WMAP5 + BAO + SN with flatness and
               time-independent w)

Notes. The K03 entry for P(π) is the same prior as in Koopmans et al. (2003). This is also the most common cosmology prior assumed
in previous studies of time-delay lenses.



Fig. 5. Left: the marginalized posterior PDF for H_0 assuming K03 cosmology and OBS κ_ext priors (panel title: OBS κ_ext with fixed Ω_m, Ω_Λ, and w).
Right: the marginalized posterior PDF for H_0 assuming K03 cosmology and ACS γ' priors (panel title: ACS γ' with fixed Ω_m, Ω_Λ, and w).
The prior uncertainty in external convergence determines the precision of the
inferred Hubble constant.

each data set constrains different parts of the parameter
space such that the joint distribution is tighter than the
individual ones.

To summarize, using all available information on
B1608+656 and the ACS and OBS priors gives H_0 =
70.6 ± 3.1 km s^-1 Mpc^-1, a precision of 4.4%. We interpret
Figure 5 as evidence that we are approaching saturation
in the information we have on the lens model for
B1608+656: the mass model is now so well constrained
that the inference of cosmological parameters from this
system is limited by our knowledge of the lens environment.
We now explore this joint inference in more detail,
first putting it in some historical context.

8.2. Comparison with other lensing H_0 results

What improvement in the measurement of H_0 do we
gain from our new observations of B1608+656? The most
recent measurement before this work by Koopmans et al.
(2003) was H_0 = 75^{+7}_{-6} km s^-1 Mpc^-1. This result was
based on a joint lensing and dynamics modeling using
the radio data, shape of the Einstein ring from the NICMOS
images and the earlier less precise velocity dispersion
measurement. Our improved analysis using the deep
ACS images and the newly measured velocity dispersion
reduces the uncertainty by more than a factor of two, even
with the inclusion of the systematic error due to the
external convergence that was previously neglected. We
attribute our lower H_0 value to our incorporation of the
realistically-skewed OBS κ_ext.

Let us now compare our H_0 measurement based
on the K03 cosmology to several recent measurements
(within the past five years) from other time-delay lenses.
Most analyses assumed Ω_m = 0.3 and Ω_Λ = 0.7;
we point out explicitly the few that did not. In
B0218+357, Wucknitz et al. (2004) measured H_0 = 78 ±
6 km s^-1 Mpc^-1 (2σ) by modeling this two-image lens
system with isothermal elliptical potentials (and effectively
measuring γ', see Section 4.2.3) but neglecting
external convergence. York et al. (2005) refined this
using the centroid position of the spiral lens galaxy
based on HST/ACS observations as a constraint; depending
on the spiral arm masking, they found H_0 =
70 ± 5 km s^-1 Mpc^-1 (unmasked) and H_0 = 61 ±
7 km s^-1 Mpc^-1 (masked), both with 2σ errors. In
the two-image FBQ 0951+2635, Jakobsson et al. (2005)
obtained H_0 = (random, 1σ) ±2 (systematic)
km s^-1 Mpc^-1 for a singular isothermal ellipsoid
model and H_0 = 63ly (random, 1σ) ±1 (systematic)
km s^-1 Mpc^-1 for a constant mass-to-light ratio model,
again ignoring external convergence. In the two-image
quasar system SDSS J1650+4251, Vuissoz et al. (2007)
found H_0 = 51.7^30 km s^-1 Mpc^-1 assuming a singular
isothermal sphere and constant external shear for
the lens model. More general lens models considered
by these authors (e.g. including lens ellipticity, or using
a de Vaucouleurs density profile) were found to
be underconstrained. In the two-image quasar system
SDSS J1206+4332, Paraficz et al. (2009) found H_0 =
73i| km s^-1 Mpc^-1 using singular isothermal ellipsoids
or spheres to describe the three lens galaxies, where
photometry was used to place additional constraints on the
lens parameters. Recently, Fadely et al. (2009) modeled
the gravitational lens Q0957+561 using four different
dark matter density profiles, each with a stellar
component. The lens is embedded in a cluster,
and the authors constrained the corresponding mass
sheet using the results of a weak lensing analysis by
Nakajima et al. (2009). Assuming a flat universe with
Ω_m = 0.274 and cosmological constant Ω_Λ = 0.726,
they found H_0 = 85^1^3 km s^-1 Mpc^-1, where the principal
uncertainties were due to the weakly constrained
stellar mass-to-light ratio (a manifestation of the radial
profile degeneracy in the lens model). Imposing constraints
from stellar population synthesis models led to
H_0 = 79.3l:|^ km s^-1 Mpc^-1.

In a nutshell, most of the recent $H_0$ measurements from
individual systems assumed isothermal profiles, and neglected
the effects of both $\gamma'$ and $\kappa_{\rm ext}$: we interpret the
significant variation between the $H_0$ estimates in the recent
literature as being due to these model limitations.
In contrast, our B1608+656 analysis explicitly incorporates
the uncertainties due to our lack of knowledge of
both $\gamma'$ and $\kappa_{\rm ext}$. In fact, a spread of $\sim 0.2$ in $\gamma'$ around
2.0 would give a spread of $\sim 40\%$ in $H_0$ for the cases
where isothermal lenses are assumed (Wucknitz 2002).
These model limitations are in turn set by a lack of information
on the systems, either because only two images are formed, or
because the extended source galaxy is not observed.
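
To make the size of the slope effect concrete, the following minimal sketch assumes the simple power-law scaling $H_0 \propto (\gamma' - 1)$. That scaling is an assumption introduced here for illustration (it is not stated in this section, and it holds only approximately for real image configurations); the function name and the fiducial value of 70 are hypothetical.

```python
def h0_rescaled(h0_isothermal, gamma_prime):
    """Rescale an H0 inferred with an isothermal lens (gamma' = 2)
    to another slope, ASSUMING H0 scales as (gamma' - 1)."""
    return h0_isothermal * (gamma_prime - 1.0)


h0_iso = 70.0  # hypothetical isothermal-model value in km/s/Mpc
low = h0_rescaled(h0_iso, 1.8)   # shallower than isothermal
high = h0_rescaled(h0_iso, 2.2)  # steeper than isothermal

# Full range spanned by gamma' = 2.0 +/- 0.2, as a fraction of h0_iso.
fractional_range = (high - low) / h0_iso
print(f"H0 from {low:.1f} to {high:.1f} km/s/Mpc "
      f"({100 * fractional_range:.0f}% of the isothermal value)")
```

Under this assumed scaling, slopes within 0.2 of isothermal span a full range of about 40% in $H_0$, which is one way to read the spread quoted above.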

Other groups have looked to improve the constraints on
$H_0$ by combining several lenses together in a joint analysis.
Using a sample of 10 time-delay lenses, Saha et al.
(2006) measured $H_0$ = 72;*;^ km s$^{-1}$ Mpc$^{-1}$ by modeling
the lenses' convergence distributions on a grid and using
the point image positions of the lenses as constraints
(the PixeLens method). Coles (2008) improved on the
method and obtained $H_0$ = 71~tt km s$^{-1}$ Mpc$^{-1}$ while
addressing more clearly their prior assumptions. Oguri
(2007) used a sample of 16 time-delay lenses to constrain
$H_0 = 68 \pm 6$ (stat.) $\pm 8$ (syst.) km s$^{-1}$ Mpc$^{-1}$ (for
$\Omega_{\rm m} = 0.24$ and $\Omega_\Lambda = 0.76$; see footnote 17) by employing
a statistical approach based on the image configurations.
By simultaneously modeling SDSS J1206+4332 with four
other systems using PixeLens, Paraficz et al. (2009) derive
$H_0$ = 61.51^4 km s$^{-1}$ Mpc$^{-1}$. The larger quoted error
bars on these ensemble estimates are perhaps a reflection
of the paucity of information available for each lens, as
discussed above. All four analyses effectively assume that
the ensemble external convergence distribution has zero
mean, which may not be accurate: for example, Oguri
(2007) constructed a sample for which external convergence
could be neglected, and then incorporated this into
the systematic error budget. Furthermore, Oguri (2007)
imposed a Gaussian prior on the slope of $\gamma' = 2.00 \pm 0.15$,
and the PixeLens method's priors on $\kappa$ may well implicitly
impose constraints on $\gamma'$ that are similar to the prior
in Oguri (2007): these priors on the slope may not be
appropriate for individual systems in the ensembles.

$^{17}$ The corresponding $H_0$ for the K03 cosmology is within
0.1% of the listed values.

In contrast, our measurement of $\gamma'$ from the ACS data
means that our results are independent of external priors
on $\gamma'$. In fact, our detailed study of the single
well-observed lens B1608+656, even incorporating the effects
of $\kappa_{\rm ext}$, constrains $H_0$ better than the studies using
ensembles of lenses. Our claim is that our analysis of the
systematic effects in B1608+656, which explicitly includes
the density profile slope and external convergence as nuisance
parameters, is one of the most extensive on a
single lens, and is rewarded with one of the most accurate
measurements of $H_0$ from time-delay lenses.

8.3. Relaxing the K03 prior 

As we described in Section 2, strong lens time delays
enable a measurement of a cosmological distance-like
quantity, $D_{\Delta t} = (1 + z_{\rm d}) D_{\rm d} D_{\rm s} / D_{\rm ds}$. While there is
some slight further dependence on cosmology in the stellar
dynamics modeling, we expect this particular distance
combination to be well constrained by the system. To illustrate
this, we plot in Figure 6 the PDF for $D_{\Delta t}$ with
and without the constraints from B1608+656, for various
choices of the cosmological parameter prior PDF. Specifically,
we show the effect of relaxing the prior on $\Omega_{\rm m}$,
$\Omega_\Lambda$ and $w$ from the K03 delta function to the two types
of uniform distributions detailed in Table 3, "UNIFORMopen"
and "UNIFORMw". We see that all of these
distributions predict the same uninformative prior for
$D_{\Delta t}$, and that the B1608+656 posterior PDFs are
correspondingly similar. With the OBS and ACS priors for
$\kappa_{\rm ext}$ and $\gamma'$, we estimate $D_{\Delta t}$ = (5.16tQ;24) $\times 10^3$ Mpc,
a precision of $\sim 5\%$. The difference between the $D_{\Delta t}$
estimates among the three priors shown is $< 2\%$.
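
As an illustration of how the time-delay distance depends on the cosmological parameters, the sketch below evaluates $D_{\Delta t} = (1 + z_{\rm d}) D_{\rm d} D_{\rm s} / D_{\rm ds}$ by direct numerical integration. It is a minimal sketch rather than the code used for this analysis: the function names are hypothetical, the dark energy equation of state is taken to be constant, radiation is neglected, and the redshifts $z_{\rm d} = 0.6304$ and $z_{\rm s} = 1.394$ are taken from the literature on B1608+656 rather than from this section.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_integral(z, omega_m, omega_l, w=-1.0, steps=2000):
    """Integral of dz'/E(z') from 0 to z (Simpson's rule, even `steps`)."""
    omega_k = 1.0 - omega_m - omega_l

    def inv_e(zp):
        a = 1.0 + zp
        return 1.0 / math.sqrt(omega_m * a**3 + omega_k * a**2
                               + omega_l * a**(3.0 * (1.0 + w)))

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return total * h / 3.0


def transverse(chi, omega_k):
    """Dimensionless transverse comoving distance for curvature omega_k."""
    if omega_k > 1e-8:
        root = math.sqrt(omega_k)
        return math.sinh(root * chi) / root
    if omega_k < -1e-8:
        root = math.sqrt(-omega_k)
        return math.sin(root * chi) / root
    return chi


def time_delay_distance(h0, omega_m, omega_l, w, z_d, z_s):
    """Time-delay distance in Mpc for H0 in km/s/Mpc."""
    omega_k = 1.0 - omega_m - omega_l
    d_h = C_KM_S / h0  # Hubble distance in Mpc
    chi_d = comoving_integral(z_d, omega_m, omega_l, w)
    chi_s = comoving_integral(z_s, omega_m, omega_l, w)
    d_d = d_h * transverse(chi_d, omega_k) / (1.0 + z_d)
    d_s = d_h * transverse(chi_s, omega_k) / (1.0 + z_s)
    d_ds = d_h * transverse(chi_s - chi_d, omega_k) / (1.0 + z_s)
    return (1.0 + z_d) * d_d * d_s / d_ds


# Redshifts assumed from the literature, not from this section.
Z_D, Z_S = 0.6304, 1.394
d_k03 = time_delay_distance(70.6, 0.3, 0.7, -1.0, Z_D, Z_S)
print(f"D_dt = {d_k03:.0f} Mpc for the K03 parameters with H0 = 70.6")
```

With these inputs the result should come out near $5.1 \times 10^3$ Mpc, and halving $H_0$ doubles $D_{\Delta t}$ exactly, which is why the quantity constrains $H_0$ far more strongly than the other parameters.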

Figure 6 suggests that a shifted log-normal approximation
(to take into account the skewness) for the product
of the B1608+656 likelihood function, marginalized over
the OBS and ACS priors, is an appropriate compression
of our results. We find that

$$
P(D_{\Delta t} | H_0, \Omega_{\rm m}, \Omega_\Lambda, w) \simeq
\frac{1}{\sqrt{2\pi}\,(x - \lambda_D)\,\sigma_D}
\exp\left[ -\frac{\left(\log(x - \lambda_D) - \mu_D\right)^2}{2\sigma_D^2} \right], \qquad (35)
$$

where $x = D_{\Delta t}/(1\,{\rm Mpc})$, $\lambda_D = 4000.$, $\mu_D = 7.053$ and
$\sigma_D = 0.2282$, accurately reproduces the cosmological parameter
inferences: for example, Hubble's constant is recovered
to $< 0.7\%$ and its 16th and 84th percentiles (68%
CL) are recovered to $< 1.1\%$ for the WMAP cosmologies
we considered.
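
The compression can be evaluated directly. The sketch below implements the shifted log-normal with the three quoted parameters, assuming that "log" denotes the natural logarithm (an assumption, although it is the reading under which the median lands near the $D_{\Delta t}$ estimate quoted in this section). The function names are hypothetical.

```python
import math

# Parameters of the shifted log-normal, with x = D_dt / (1 Mpc).
LAMBDA_D = 4000.0
MU_D = 7.053
SIGMA_D = 0.2282


def d_dt_pdf(d_dt_mpc):
    """Shifted log-normal density; zero at or below the shift LAMBDA_D."""
    shifted = d_dt_mpc - LAMBDA_D
    if shifted <= 0.0:
        return 0.0
    z = (math.log(shifted) - MU_D) / SIGMA_D
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * shifted * SIGMA_D)


def d_dt_at_sigma(n_sigma):
    """Value of D_dt at n_sigma Gaussian standard deviations in log space."""
    return LAMBDA_D + math.exp(MU_D + n_sigma * SIGMA_D)


median = d_dt_at_sigma(0.0)
lower = d_dt_at_sigma(-1.0)  # about the 16th percentile
upper = d_dt_at_sigma(+1.0)  # about the 84th percentile
print(f"median {median:.0f} Mpc, central 68% from {lower:.0f} to {upper:.0f} Mpc")
```

Because the distribution is Gaussian in $\log(x - \lambda_D)$, its percentiles follow from the Gaussian ones without any numerical integration, and the upper error bar is larger than the lower one, reflecting the skewness mentioned above.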

8.4. Constraints on $\Omega_{\rm m}$ and $\Omega_\Lambda$

Fig. 6. PDFs for $D_{\Delta t}$, showing the B1608+656 posterior
constraints on $D_{\Delta t}$ (solid) given assorted uniform priors for the
cosmological parameters (dotted, labeled). See the text for a full
description of these various priors. In this figure we assign the
ACS and OBS priors for $\gamma'$ and $\kappa_{\rm ext}$. B1608+656 provides tight
constraints on $D_{\Delta t}$, which translates into information about $\Omega_{\rm m}$,
$\Omega_\Lambda$ and $w$ as well as $H_0$.

Based on the construction of $D_{\Delta t}$, we expect strong
lens time delays to be more sensitive to $H_0$ than the
other three cosmological parameters. This is shown in
Figure 7, where we consider the $w = -1$ uniform cosmological
prior "UNIFORMopen" and plot the marginalized
B1608+656 posterior PDF to show the influence of the
lensing data (blue lines). While there is a slight dependence
on $\Omega_\Lambda$, we see that the B1608+656 data do indeed
primarily constrain $H_0$. In contrast, we plot the posterior
PDF from the analysis of the 5 year WMAP data set (red
lines, Dunkley et al. 2009). With no constraint on the
curvature of space, the CMB data provide only a weak
prior on $H_0$, which is highly degenerate with $\Omega_{\rm m}$ and
$\Omega_\Lambda$. Importance sampling the WMAP MCMC chains
with the B1608+656 likelihood, we obtain the joint posterior
PDF, plotted in black.
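
The importance sampling step can be sketched in a few lines. This is a toy illustration and not the pipeline used for the results in this paper: uniform draws in $H_0$ stand in for real WMAP MCMC chain samples, the likelihood is the shifted log-normal compression of Section 8.3, and the constant `A` (chosen so that $D_{\Delta t} = A / H_0$ for one fixed flat cosmology) is an assumed, illustrative value.

```python
import math
import random

# Shifted log-normal parameters quoted in Section 8.3 (x in Mpc).
LAMBDA_D, MU_D, SIGMA_D = 4000.0, 7.053, 0.2282

# ASSUMPTION: for a fixed flat cosmology D_dt = A / H0; A is an
# illustrative constant in Mpc km/s/Mpc, not a value from the paper.
A = 3.6e5


def likelihood(d_dt_mpc):
    """Shifted log-normal likelihood of a predicted time-delay distance."""
    shifted = d_dt_mpc - LAMBDA_D
    if shifted <= 0.0:
        return 0.0
    z = (math.log(shifted) - MU_D) / SIGMA_D
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * shifted * SIGMA_D)


def weighted_median(values, weights):
    """Median of `values` under non-negative importance weights."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(weights)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]


rng = random.Random(0)
# Stand-in for prior samples (e.g. thinned MCMC chain entries) in H0.
prior_h0 = [rng.uniform(40.0, 100.0) for _ in range(20000)]
# Each prior sample is reweighted by the likelihood of its predicted D_dt.
weights = [likelihood(A / h0) for h0 in prior_h0]

posterior_median = weighted_median(prior_h0, weights)
effective_samples = sum(weights) ** 2 / sum(w * w for w in weights)
print(f"posterior median H0 ~ {posterior_median:.1f} km/s/Mpc, "
      f"effective sample size ~ {effective_samples:.0f}")
```

The effective sample size is worth monitoring in practice: when the likelihood is much narrower than the prior, only a small fraction of the chain carries appreciable weight, and the reweighted posterior becomes noisy.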

Strong lens time delays are an example of a kinematic
cosmological probe, i.e., one that is sensitive to the geometry
and expansion rate of the Universe, but not to
dynamical assumptions about the growth of structure
in the Universe. In Table 4 we compare the B1608+656
data set to a number of other kinematic probes from the
literature. The WMAP data constrain the angular diameter
distance to the last scattering surface; these other
data sets effectively provide a second distance estimate
that breaks the degeneracy between $H_0$ and the curvature
of space. In the B1608+656 case, we constrain $\Omega_{\rm k}$
to be $-0.005^{+0.014}_{-0.026}$ (95% CL). We can see that in terms
of constraining the curvature parameter, B1608+656 is
more informative than the HST Key Project $H_0$ measurement,
and is comparable to the current SNe Ia data
set.

Figure 7 also shows the primary nuisance parameter,
$\kappa_{\rm ext}$. When B1608+656 and the WMAP data are combined,
the PDF for $\kappa_{\rm ext}$ shifts and tightens very slightly,
as we expect from the discussion in Section 8.1. If we
relax the OBS prior on $\kappa_{\rm ext}$ to uniform, then we obtain
$-0.032 < \Omega_{\rm k} < 0.021$ (95% CL), which is still tighter
than the HST KP constraints.



Data sets          $\Omega_{\rm k}$ (95% CL)                 Precision
WMAP5              $-0.285 < \Omega_{\rm k} < 0.010$         15%
WMAP5 + HST KP     $-0.052 < \Omega_{\rm k} < 0.013$         3.3%
WMAP5 + SN         $-0.032 < \Omega_{\rm k} < 0.008$         2.0%
WMAP5 + BAO        $-0.017 < \Omega_{\rm k} < 0.007$         1.2%
WMAP5 + B1608      $-0.031 < \Omega_{\rm k} < 0.009$         2.0%

The third column gives the "precision," quantified as half the
95% confidence interval in $(1.0 - \Omega_{\rm k})$, as a percentage.
Sources: WMAP5: http://lambda.gsfc.nasa.gov and Komatsu et al. (2009).
HST KP: Freedman et al. (2001). SN: based on the "union" SN samples
compiled by Kowalski et al. (2008). BAO: Percival et al. (2007).
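
The "precision" column can be reproduced from the quoted intervals. The short sketch below applies the stated definition (half the width of the 95% interval, as a percentage) to each row; the function and dictionary names are hypothetical.

```python
def curvature_precision(omega_k_low, omega_k_high):
    """Half the width of the interval in (1 - Omega_k), in percent."""
    return 100.0 * 0.5 * (omega_k_high - omega_k_low)


# 95% CL intervals on Omega_k as listed in the table above.
intervals = {
    "WMAP5": (-0.285, 0.010),
    "WMAP5 + HST KP": (-0.052, 0.013),
    "WMAP5 + SN": (-0.032, 0.008),
    "WMAP5 + BAO": (-0.017, 0.007),
    "WMAP5 + B1608": (-0.031, 0.009),
}

for name, (low, high) in intervals.items():
    print(f"{name:16s} {curvature_precision(low, high):5.2f}%")
```

The half-widths agree with the tabulated precisions after rounding, which also confirms that the upper limit for the B1608+656 row is 0.009.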

8.5. Constraints on dark energy 

As noted by many authors (e.g. Hu 2005;
Komatsu et al. 2009; Riess et al. 2009), the degeneracy
breaking shown in the previous subsection can be recast
as a mechanism for constraining the equation of state of
dark energy, $w$. If we assert a precisely flat geometry for
the Universe, as motivated by the inflationary scenario,
we can spend our available information on constraining
$w$ instead. Figure 8 shows the marginalized posterior
PDF for the cosmological parameters $H_0$, $\Omega_\Lambda = 1 - \Omega_{\rm m}$
and $w$, along with the nuisance parameter $\kappa_{\rm ext}$, again
comparing the B1608+656 constraints with uniform
and WMAP priors, and the WMAP constraints alone.
With the WMAP data alone, $w$ is strongly degenerate
with $H_0$ and $\Omega_\Lambda$. Including B1608+656, which mainly
provides constraints on $H_0$, the $H_0$-$w$-$\Omega_\Lambda$ degeneracy is
partly broken. The resulting marginalized distribution
gives $w = -0.94^{+0.17}_{-0.19}$, consistent with a cosmological
constant. The corresponding value of Hubble's constant
is $H_0 = 69.7^{+4.9}_{-5.0}$ km s$^{-1}$ Mpc$^{-1}$.

We summarize our inferences of $H_0$ and $w$ in this
variable-$w$ model in Table 5, comparing to a similar set of
alternative kinematic probes referred to in the previous
section. We see that, combining with the WMAP 5 year
data set and marginalizing over all other parameters, the
B1608+656 data set provides a measurement of Hubble's
constant with an uncertainty of 6.9%, with the equation
of state parameter simultaneously constrained to 18%.
This level of precision is better than that available from
the HST KP and is competitive with the current BAO
measurements.

Our results are consistent with the results from all the 
other probes listed. This is not a trivial statement: com- 
bining each data set with the WMAP 5 year prior allows 
us not only to quantify the relative constraining power of 
each one, it also retains the possibility of detecting incon- 
sistencies between data sets. As it is, it appears that all 
the kinematic probes listed are in agreement within their 
quoted uncertainties. Some tension might be present if 
the supernovae and B1608+656 were considered separately
from a combination of local HST $H_0$ measurements
and BAO constraints, but we have no compelling
reason to make such a division. As the statistical errors 
associated with each probe are decreased, other inconsis- 
tencies may arise: we might expect there to always be a 
need for careful pairwise data set combinations. 

Fig. 7. The B1608+656 marginalized posterior PDF for $H_0$, $\Omega_{\rm m}$, $\Omega_\Lambda$ and $\kappa_{\rm ext}$ in a $w = -1$ cosmological model and assuming ACS
$\gamma'$ and OBS $\kappa_{\rm ext}$ priors; contours are 68% and 95% confidence levels. The three sets of colored contours correspond to three different
prior/data set combinations. Blue: B1608+656 constraints, given the UNIFORMopen prior; red: the prior provided by the WMAP 5 year
data set alone; black: the joint constraints from combining WMAP and B1608+656. The blue contours in the $\Omega_{\rm m}$ and $\Omega_\Lambda$ columns are
omitted since they would show almost no constraints, as indicated by the diagonal panels.

Finally, we incorporate B1608+656 into a global
analysis of cosmological data sets. As an example, we importance
sample from the WBSw prior PDF; this is the
joint posterior PDF from the joint analysis of WMAP5,
BAO and SN data. This prior is already very tight,
characterized by a median and 68% confidence limits of
$H_0$ = 70.3j;i[ g km s$^{-1}$ Mpc$^{-1}$. When we include information
from B1608+656 with the ACS $\gamma'$ and OBS $\kappa_{\rm ext}$
priors, we obtain $H_0$ = 70.4^J'4 km s$^{-1}$ Mpc$^{-1}$, a slight
shift in centroid and a 6% reduction in the confidence interval.
The small size of the shift indicates global consistency among the
WMAP5, BAO, SN and B1608+656 data sets.

8.6. Future prospects 

In this paper, we have studied a single strong gravitational
lens, B1608+656, investigating in depth the various
model parameter degeneracies and systematic effects.
At present, B1608+656 remains the only strong lens system
with (i) time delay measurements with errors of only
a few percent, and (ii) an extended source surface brightness
distribution for accurate lens modeling; as we have
shown, these two properties together enable the careful
study and the resulting tight constraint on $H_0$.

Table 5 shows that even this one system provides competitive
accuracy on $H_0$ and $w$ for a single kinematic
probe, especially when we consider that all the other experiments
involved averaging together many independent
distance measurements. What should we expect from extending
this study to many more lenses? As we showed
in Section 8.1, if the data are good enough to constrain
the density profile slope to a few percent, the accuracy
of the cosmological parameter inference is limited, as it
is in B1608+656, by our knowledge of the lens environment,
$\kappa_{\rm ext}$.

Fig. 8. The B1608+656 marginalized posterior PDF for $H_0$, $\Omega_\Lambda$, $w$ and $\kappa_{\rm ext}$ in a flat cosmological model, again assuming ACS $\gamma'$ and
OBS $\kappa_{\rm ext}$ priors; contours are 68% and 95% confidence levels. The three sets of colored contours correspond to three different prior/data
set combinations. Blue: B1608+656 constraints, given the UNIFORMw prior; red: the prior provided by the WMAP 5 year data set alone;
black: the joint constraints from combining WMAP and B1608+656. The blue contours in the $\Omega_{\rm m}$ and $\Omega_\Lambda$ columns are omitted since they
would show almost no constraints, as indicated by the diagonal panels.

However, we also outlined in Section 5 how information
from numerical simulations and the photometry
in the field can be used to constrain this nuisance parameter
and yield an unbiased estimate of $H_0$. Furthermore,
as discussed in Section 8.1, stellar dynamics provides a
significant amount of information on $\kappa_{\rm ext}$ by limiting its
permissible range of values. While we, and also Treu et al.
(2009) and Fassnacht et al. (2009), discuss how the
line-of-sight contributions to $\kappa_{\rm ext}$ should average to zero over
many lens systems, lens galaxies (like all massive galaxies)
tend to live in locally overdense environments,
such that the local contribution to $\kappa_{\rm ext}$ would be non-zero.
Careful studies of the lens environments (e.g.
Momcheva et al. 2006; Fassnacht et al. 2006a; Blandford
et al. in preparation) and of N-body simulations with gas
physics to determine this local contribution to $\kappa_{\rm ext}$ will
be crucial for obtaining $H_0$ from a large sample of lenses.
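
The reason a non-zero local $\kappa_{\rm ext}$ biases $H_0$ can be seen from the standard mass-sheet relation, under which a lens model that ignores the external convergence overestimates $H_0$ by a factor $1/(1 - \kappa_{\rm ext})$. The sketch below applies that relation; it is an illustration under that assumption, with hypothetical function names and a hypothetical model value of $H_0$.

```python
def corrected_h0(h0_model, kappa_ext):
    """H0 after accounting for external convergence,
    ASSUMING the mass-sheet relation H0 = (1 - kappa_ext) * H0_model."""
    return (1.0 - kappa_ext) * h0_model


def corrected_d_dt(d_dt_model, kappa_ext):
    """Time-delay distance after the same correction (D_dt scales as 1/H0)."""
    return d_dt_model / (1.0 - kappa_ext)


h0_model = 78.0  # hypothetical value from a model with kappa_ext = 0
for kappa in (0.0, 0.05, 0.10):
    print(f"kappa_ext = {kappa:.2f} -> H0 = {corrected_h0(h0_model, kappa):.1f} km/s/Mpc")
```

An external convergence of 0.10 therefore lowers the inferred $H_0$ by 10%, so a population of lenses with a systematically positive local $\kappa_{\rm ext}$ would not average to an unbiased result unless that contribution is modeled.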



TABLE 5

Dark energy constraints from WMAP5 combined with various data sets, assuming flat geometry.

Data sets        $H_0$ / km s$^{-1}$ Mpc$^{-1}$   Precision   $w$                        Precision
WMAP5            $74^{+15}_{-14}$                 20%         -1 06+"-'*^                42%
WMAP5+HST KP     72                               10%                                    23%
WMAP5+SN                                          2.3%        ' -0.064                   6.5%
WMAP5+BAO        73 g+4> '•J-»_4.g                6.6%        -1 15+0-21                 22%
WMAP5+Riess      $74.2 \pm 3.6$                   5.0%        $-1.12 \pm 0.12$           12%
WMAP5+B1608      $69.7^{+4.9}_{-5.0}$             6.9%        $-0.94^{+0.17}_{-0.19}$    18%

The "precisions" in the third and fifth columns are defined as half the 68% confidence interval, as a percentage of either 72 for $H_0$ or $-1.0$
for $w$. Sources: WMAP5: http://lambda.gsfc.nasa.gov and Komatsu et al. (2009); the $H_0$ estimate was taken from the previously listed website.
HST KP: Freedman et al. (2001). SN: based on the "union" SN samples compiled by Kowalski et al. (2008). BAO: Percival et al. (2007).
Riess: Riess et al. (2009); the $H_0$ value of $74.2 \pm 3.6$ is not marginalized over other cosmological parameters.

If we are able to average together $N$ systems we should,
in principle, be able to reduce our uncertainty by a factor of $\sqrt{N}$.
In practice, the accuracy of the combination procedure
will sooner be limited by the systematic uncertainty in
the shape and centroid of the assumed $\kappa_{\rm ext}$ distribution:
investigating the properties of this distribution is perhaps
the most urgent topic for further work. Likewise, if
the density profile slope cannot be constrained for each
time-delay lens individually, the details of the prior PDF
assigned for $\gamma'$ will become important as the ensemble
grows.
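
The trade-off between the $\sqrt{N}$ gain and a shared systematic floor can be illustrated with a simple error model. The sketch below assumes independent per-lens statistical errors and a single systematic term common to all lenses, added in quadrature; both that model and the numerical values are assumptions chosen for illustration, not estimates from this work.

```python
import math


def ensemble_uncertainty(sigma_single, n_lenses, sigma_sys=0.0):
    """Fractional uncertainty from averaging n_lenses independent systems,
    with a common systematic term that does not average down."""
    return math.sqrt(sigma_single**2 / n_lenses + sigma_sys**2)


sigma_single = 0.05  # hypothetical per-lens fractional uncertainty
sigma_sys = 0.01     # hypothetical shared systematic (e.g. kappa_ext prior)

for n in (1, 10, 100, 1000):
    stat_only = ensemble_uncertainty(sigma_single, n)
    with_sys = ensemble_uncertainty(sigma_single, n, sigma_sys)
    print(f"N = {n:4d}: statistical only {100 * stat_only:.2f}%, "
          f"with systematic floor {100 * with_sys:.2f}%")
```

In this toy model the systematic term dominates once $N$ exceeds $(\sigma_{\rm single}/\sigma_{\rm sys})^2$, here 25 lenses, after which adding more systems brings little improvement.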

In the near future, cadenced surveys such as those
planned with the Large Synoptic Survey Telescope
(LSST) and being undertaken by the Panoramic Survey
Telescope and Rapid Response System (Pan-STARRS)
will discover large numbers of time-delay lenses, prompting
us to consider performing analyses such as the one
described here on hundreds of lens systems. In practice,
obtaining data of the quality we have presented
here for hundreds of suitable lenses will pose a significant
observational challenge. Nevertheless, Dobke et al.
(2009), Coe & Moustakas (2009) and Oguri & Marshall
(in preparation) investigate constraints on cosmological
parameters based on large samples of time-delay lenses.
In particular, Coe & Moustakas (2009) suggest that, in
terms of raw precision and in combination with a prior
PDF from Planck, an LSST ensemble could reach
sub-percent level precision in $H_0$, and constrain $w$ to 3% or
better, provided that systematic effects such as $\kappa_{\rm ext}$
are under control. Our work has already addressed some
of these systematic effects, and will provide a basis for
future analysis of large samples of time-delay lenses and
lens environment studies.

9. CONCLUSIONS 

We have studied the well-observed gravitational lens
B1608+656 and used it to infer the values of cosmological
parameters; we outlined and followed a Bayesian approach
for combining three data sets: HST/ACS imaging,
stellar velocity dispersion measurement, and the
time delays between the multiple images. Diagnosing
the principal systematic effects, we included two nuisance
parameters ($\gamma'$ and $\kappa_{\rm ext}$) in the data model to
account for them, assigning well-motivated prior PDFs
and marginalizing over them. We draw the following
conclusions:

• We find that the HST/ACS images constrain the
density profile slope parameter $\gamma' = 2.08 \pm 0.03$,
which we propagate through the cosmological parameter
inference as a prior PDF. Relaxing this
prior to a uniform distribution degrades the precision
on $H_0$ from 4.4% to 7.0%; the SLACS intrinsic
profile slope parameter distribution is not significantly
more informative than the uniform prior.

• With the ACS prior for $\gamma'$, we find that the uncertainty on the inferred
cosmological parameters is dominated by the
external convergence $\kappa_{\rm ext}$. Ray-tracing through the
Millennium Simulation gives a PDF for $\kappa_{\rm ext}$ due
to line-of-sight contributions that has zero mean
and width $\simeq 0.04$, while using the galaxy number
counts in the B1608+656 field in conjunction with
the MS gives $\kappa_{\rm ext} = 0.10^{+0.08}_{-0.05}$.

Using our most informative priors on the two nuisance
parameters, we arrive at the following cosmographic inferences:

• In the K03 cosmology ($\Omega_{\rm m} = 0.3$, $\Omega_\Lambda = 0.7$,
$w = -1$, and uniform $H_0$), we obtain from the
B1608+656 data set $H_0 = 70.6 \pm 3.1$ km s$^{-1}$ Mpc$^{-1}$
(68% CL). The 4.4% error includes both statistical
and dominant systematic uncertainties, through
the marginalization described above. This is a significant
improvement on the earlier measurement
of $H_0$ = 75j;g km s$^{-1}$ Mpc$^{-1}$ by Koopmans et al.
(2003).

• Time-delay lenses are sensitive primarily to $H_0$ but
are weakly dependent on other cosmological parameters;
the lensing measurement of $H_0$ is robust
and useful for studying dark energy when combined
with other cosmological probes. We find that for
B1608+656 the cosmographic information can be
summarized as a shifted log-normal probability distribution
for the time-delay distance $D_{\Delta t}$ in units
of Mpc, with the three parameters $\lambda_D = 4000.$,
$\mu_D = 7.053$ and $\sigma_D = 0.2282$.

• In a $\Lambda$CDM cosmology (with $w = -1$), the
B1608+656 data set breaks the degeneracy between
$\Omega_{\rm m}$ and $\Omega_\Lambda$ in the WMAP 5 year data set,
and constrains the curvature parameter to be zero
to 2.0% (95% CL), a level of precision similar to
that afforded by the current Type Ia SNe sample.

• B1608+656 in combination with the WMAP 5
year data set, assuming flatness and allowing
(a time-independent) $w$ to vary, gives $H_0 =
69.7^{+4.9}_{-5.0}$ km s$^{-1}$ Mpc$^{-1}$ and $w = -0.94^{+0.17}_{-0.19}$ (68%
CL).

These are significant improvements on the WMAP5-only
constraints of $H_0 = 74^{+15}_{-14}$ km s$^{-1}$ Mpc$^{-1}$ and
$w$ = -1.06j^Q;42. B1608+656 is as competitive as
the current BAO data in determining $w$ when combined
with WMAP5.

Our detailed analysis of B1608+656 provides the
framework for using large samples of time-delay lenses
as cosmological probes in the near future. We anticipate
the local contribution to $\kappa_{\rm ext}$, which would not average
away with a large sample of lenses, being the dominant
residual systematic error. Several lens environment studies
to circumvent this are underway; with the effects from
$\kappa_{\rm ext}$ accurately modeled, future samples of time-delay
gravitational lenses should be a competitive cosmological
probe.

We thank M. Bradač, J. Hartlap, E. Komatsu,
J. P. McKean and P. Schneider for useful discussions. We
are grateful to the anonymous referee whose suggestions
and comments helped clarify parts of the paper. S.H.S.
is supported in part through the Deutsche Forschungsgemeinschaft
under the project SCHN 342/7-1. C.D.F. acknowledges
support under the HST program #GO-10158.
Support for program #GO-10158 was provided by NASA
through a grant from the Space Telescope Science Institute,
which is operated by the Association of Universities
for Research in Astronomy, Inc., under NASA
contract NAS 5-26555. C.D.F. acknowledges the support
from the European Community's Sixth Framework Marie
Curie Research Training Network Programme, contract
no. MRTN-CT-2004-505183 "ANGLES." R.D.B. acknowledges
support through NSF grant AST 05-07732.
L.V.E.K. is supported in part through an NWO-VIDI
career grant (project number 639.042.505). T.T. acknowledges
support from the NSF through CAREER
award NSF-0642621, by the Sloan Foundation through
a Sloan Research Fellowship, and by the Packard Foundation
through a Packard Fellowship. This work was supported
in part by the NSF under award AST-0444059,
the TABASGO foundation in the form of a research fellowship
(P.J.M.), and by the US Department of Energy
under contract number DE-AC02-76SF00515. Based in
part on observations made with the NASA/ESA Hubble






Space Telescope, obtained at the Space Telescope Science
Institute, which is operated by the Association of Universities
for Research in Astronomy, Inc., under NASA
contract NAS 5-26555. These observations are associated
with program #GO-10158.



REFERENCES 



Auger, M. W., Fassnacht, C. D., Abrahamse, A. L., Lubin, L. M., 

& Squires, G. K. 2007, AJ, 134, 668 
Auger, M. W., Fassnacht, C. D., Wong, K. C, Thompson, D., 

Matthews, K., & Soifer, B. T. 2008, ApJ, 673, 778 
Barkana, R. 1998, ApJ, 502, 531 

Barnabe, M., Czoske, O., Koopmans, L. V. E., Treu, T., Bolton, 
A. S., & Gavazzi, R. 2009, MNRAS, 399, 21 

Barnabe, M. & Koopmans, L. V. E. 2007, ApJ, 666, 726 

Binney, J. & Tremaine, S. 1987, Galactic dynamics (Princeton, NJ, 
Princeton University Press, 1987, 747 p.) 

Bonamente, M., Joy, M. K., LaRoque, S. J., Carlstrom, J. E., 
Reese, E. D., & Dawson, K. S. 2006, ApJ, 647, 25 

Browne, I. W. A., Wilkinson, P. N., Jackson, N. J. P., Myers, S. T., 
Fassnacht, C. D., Koopmans, L. V. E., Marlow, D. R., Norbury, 
M., Rusin, D., Sykes, C. M., Biggs, A. D., Blandford, R. D., de 
Bruyn, A. G., Chae, K.-H., Helbig, P., King, L. J., McKean,
J. P., Pearson, T. J., Phillips, P. M., Readhead, A. C. S.,
Xanthopoulos, E., & York, T. 2003, MNRAS, 341, 13 

Coe, D. & Moustakas, L. A. 2009, ApJ, 706, 45 

Coles, J. 2008, ApJ, 679, 17 

Czoske, O., Barnabe, M., Koopmans, L. V. E., Treu, T., & Bolton, 
A. S. 2008, MNRAS, 384, 987 

De Lucia, G. & Blaizot, J. 2007, MNRAS, 375, 2 

Dobke, B. M., King, L. J., Fassnacht, C. D., & Auger, M. W. 2009,
MNRAS, 397, 311 

Dunkley, J., Komatsu, E., Nolta, M. R., Spergel, D. N., Larson,
D., Hinshaw, G., Page, L., Bennett, C. L., Gold, B., Jarosik, N.,
Weiland, J. L., Halpern, M., Hill, R. S., Kogut, A., Limon, M.,
Meyer, S. S., Tucker, G. S., Wollack, E., & Wright, E. L. 2009,
ApJS, 180, 306 

Dye, S., Evans, N. W., Belokurov, V., Warren, S. J., & Hewett, P.

2008, MNRAS, 388, 384 
Dye, S. & Warren, S. J. 2005, ApJ, 623, 31 

Eisenstein, D. J., Zehavi, I., Hogg, D. W., Scoccimarro, R.,
Blanton, M. R., Nichol, R. C., Scranton, R., Seo, H.-J.,
Tegmark, M., Zheng, Z., Anderson, S. F., Annis, J., Bahcall,
N., Brinkmann, J., Burles, S., Castander, F. J., Connolly, A.,
Csabai, I., Doi, M., Fukugita, M., Frieman, J. A., Glazebrook,
K., Gunn, J. E., Hendry, J. S., Hennessy, G., Ivezić, Z., Kent, S.,
Knapp, G. R., Lin, H., Loh, Y.-S., Lupton, R. H., Margon, B.,
McKay, T. A., Meiksin, A., Munn, J. A., Pope, A., Richmond,
M. W., Schlegel, D., Schneider, D. P., Shimasaku, K., Stoughton,
C., Strauss, M. A., SubbaRao, M., Szalay, A. S., Szapudi, I.,
Tucker, D. L., Yanny, B., & York, D. G. 2005, ApJ, 633, 560

Fadely, R., Keeton, C. R., Nakajima, R., & Bernstein, G. M. 2009, 

ArXiv e-prints (0909.1807) 
Falco, E. E., Gorenstein, M. V., & Shapiro, I. I. 1985, ApJ, 289,
L1

Fassnacht, C. D., Gal, R. R., Lubin, L. M., McKean, J. P., Squires,
G. K., & Readhead, A. C. S. 2006a, ApJ, 642, 30
Fassnacht, C. D., Koopmans, L. V. E., & Wong, K. C. 2009, ArXiv 

e-prints (0909.4301) 
Fassnacht, C. D., McKean, J. P., Koopmans, L. V. E., Treu, T., 

Blandford, R. D., Auger, M. W., Jeltema, T. E., Lubin, L. M., 

Margoniner, V. E., & Wittman, D. 2006b, ApJ, 651, 667 
Fassnacht, C. D., Pearson, T. J., Readhead, A. C. S., Browne,
I. W. A., Koopmans, L. V. E., Myers, S. T., & Wilkinson, P. N.
1999, ApJ, 527, 498
Fassnacht, C. D., Womble, D. S., Neugebauer, G., Browne, I. W. A.,
Readhead, A. C. S., Matthews, K., & Pearson, T. J. 1996, ApJ,
460, L103

Fassnacht, C. D., Xanthopoulos, E., Koopmans, L. V. E., & Rusin, 

D. 2002, ApJ, 581, 823 

Freedman, W. L., Madore, B. F., Gibson, B. K., Ferrarese, L.,
Kelson, D. D., Sakai, S., Mould, J. R., Kennicutt, Jr., R. C.,
Ford, H. C., Graham, J. A., Huchra, J. P., Hughes, S. M. G.,
Illingworth, G. D., Macri, L. M., & Stetson, P. B. 2001, ApJ,
553, 47

Gavazzi, R., Treu, T., Koopmans, L. V. E., Bolton, A. S.,
Moustakas, L. A., Burles, S., & Marshall, P. J. 2008, ApJ, 677,
1046



Gavazzi, R., Treu, T., Rhodes, J. D., Koopmans, L. V. E., Bolton,
A. S., Burles, S., Massey, R. J., & Moustakas, L. A. 2007, ApJ,
667, 176

Gebhardt, K., Richstone, D., Tremaine, S., Lauer, T. R., Bender,
R., Bower, G., Dressler, A., Faber, S. M., Filippenko, A. V.,
Green, R., Grillmair, C., Ho, L. C., Kormendy, J., Magorrian,
J., & Pinkney, J. 2003, ApJ, 583, 92

Grogin, N. A. & Narayan, R. 1996a, ApJ, 464, 92 

— . 1996b, ApJ, 473, 570 

Hernquist, L. 1990, ApJ, 356, 359 

Herrnstein, J. R., Moran, J. M., Greenhill, L. J., Diamond, P. J., 

Inoue, M., Nakai, N., Miyoshi, M., Henkel, C., & Riess, A. 1999, 

Nature, 400, 539 
Hicken, M., Wood-Vasey, W. M., Blondin, S., Challis, P., Jha, S., 

Kelly, P. L., Rest, A., & Kirshner, R. P. 2009, ApJ, 700, 1097 
Hilbert, S., Hartlap, J., White, S. D. M., & Schneider, P. 2009,
A&A, 499, 31
Hilbert, S., White, S. D. M., Hartlap, J., & Schneider, P. 2007,
MNRAS, 382, 121
— . 2008, MNRAS, 386, 1845 
Hu, W. 2005, 339, 215 

Humphrey, P. J. & Buote, D. A. 2009, ArXiv e-prints (0911.0678) 
Jaffe, W. 1983, MNRAS, 202, 995 

Jakobsson, P., Hjorth, J., Burud, I., Letawe, G., Lidman, C., &
Courbin, F. 2005, A&A, 431, 103
Keeton, C. R. & Zabludoff, A. I. 2004, ApJ, 612, 660 
Kirshner, R. P. & Kwan, J. 1974, ApJ, 193, 27 
Kochanek, C. S. 2002, ApJ, 578, 25 

Komatsu, E., Dunkley, J., Nolta, M. R., Bennett, C. L., Gold,
B., Hinshaw, G., Jarosik, N., Larson, D., Limon, M., Page, L.,
Spergel, D. N., Halpern, M., Hill, R. S., Kogut, A., Meyer, S. S.,
Tucker, G. S., Weiland, J. L., Wollack, E., & Wright, E. L. 2009,
ApJS, 180, 330

Koopmans, L. V. E., Bolton, A., Treu, T., Czoske, O., Auger,
M. W., Barnabe, M., Vegetti, S., Gavazzi, R., Moustakas, L. A.,
& Burles, S. 2009, ApJ, 703, L51

Koopmans, L. V. E. & Treu, T. 2002, ApJ, 568, L5 

Koopmans, L. V. E., Treu, T., Bolton, A. S., Burles, S., &
Moustakas, L. A. 2006, ApJ, 649, 599

Koopmans, L. V. E., Treu, T., Fassnacht, C. D., Blandford, R. D., 
& Surpi, G. 2003, ApJ, 599, 70 

Kowalski, M., Rubin, D., Aldering, G., Agostinho, R. J., Amadon,
A., Amanullah, R., Balland, C., Barbary, K., Blanc, G., Challis,
P. J., Conley, A., Connolly, N. V., Covarrubias, R., Dawson,
K. S., Deustua, S. E., Ellis, R., Fabbro, S., Fadeyev, V., Fan,
X., Farris, B., Folatelli, G., Frye, B. L., Garavini, G., Gates,
E. L., Germany, L., Goldhaber, G., Goldman, B., Goobar, A.,
Groom, D. E., Haissinski, J., Hardin, D., Hook, I., Kent, S.,
Kim, A. G., Knop, R. A., Lidman, C., Linder, E. V., Mendez, J.,
Meyers, J., Miller, G. J., Moniez, M., Mourao, A. M., Newberg,
H., Nobili, S., Nugent, P. E., Pain, R., Perdereau, O., Perlmutter,
S., Phillips, M. M., Prasad, V., Quimby, R., Regnault, N.,
Rich, J., Rubenstein, E. P., Ruiz-Lapuente, P., Santos, F. D.,
Schaefer, B. E., Schommer, R. A., Smith, R. C., Soderberg,
A. M., Spadafora, A. L., Strolger, L.-G., Strovink, M., Suntzeff,
N. B., Suzuki, N., Thomas, R. C., Walton, N. A., Wang, L.,
Wood-Vasey, W. M., & Yun, J. L. 2008, ApJ, 686, 749

Kronawitter, A., Saglia, R. P., Gerhard, O., & Bender, R. 2000,
A&AS, 144, 53

Lewis, A. & Bridle, S. 2002, Phys. Rev. D, 66, 103511 

MacKay, D. 2003, Information Theory, Inference and Learning 
Algorithms (Cambridge: CUP) 

Macri, L. M., Stanek, K. Z., Bersier, D., Greenhill, L. J., & Reid,
M. J. 2006, ApJ, 652, 1133

McKean, J. P., Auger, M. W., Koopmans, L. V. E., Vegetti, S., 
Czoske, O., Fassnacht, C. D., Treu, T., More, A., & Kocevski, 
D. D. 2009, ArXiv e-prints (0910.1133) 

Merritt, D. 1985, AJ, 90, 1027 

Merritt, D. & Aguilar, L. A. 1985, MNRAS, 217, 787 
Momcheva, I., Williams, K., Keeton, C., & Zabludoff, A. 2006,
ApJ, 641, 169






Myers, S. T., Fassnacht, C. D., Djorgovski, S. G., Blandford, 
R. D., Matthews, K., Neugebauer, G., Pearson, T. J., Readhead, 
A. C. S., Smith, J. D., Thompson, D. J., Womble, D. S., Browne, 
I. W. A., Wilkinson, P. N., Nair, S., Jackson, N., Snellen, I. A. G., 
Miley, G. K., de Bruyn, A. G., & Schilizzi, R. T. 1995, ApJ, 447, 
L5 

Myers, S. T., Jackson, N. J., Browne, I. W. A., de Bruyn, A. G., 
Pearson, T. J., Readhead, A. C. S., Wilkinson, P. N., Biggs, 
A. D., Blandford, R. D., Fassnacht, C. D., Koopmans, L. V. E., 
Marlow, D. R., McKean, J. P., Norbury, M. A., Phillips, P. M., 
Rusin, D., Shepherd, M. C., & Sykes, C. M. 2003, MNRAS, 341, 
1 

Nakajima, R., Bernstein, G. M., Fadely, R., Keeton, C. R., & 

Schrabback, T. 2009, ApJ, 697, 1793 
Oguri, M. 2007, ApJ, 660, 1 

Oke, J. B., Cohen, J. G., Carr, M., Cromer, J., Dingizian, A., 
Harris, F. H., Labrecque, S., Lucinio, R., Schaal, W., Epps, H., 
& Miller, J. 1995, PASP, 107, 375 
Osipkov, L. P. 1979, Pis'ma Astronomicheskii Zhurnal, 5, 77
Paraficz, D., Hjorth, J., & Elíasdóttir, Á. 2009, A&A, 499, 395
Percival, W. J., Cole, S., Eisenstein, D. J., Nichol, R. C, Peacock, 

J. A., Pope, A. C, & Szalay, A. S. 2007, MNRAS, 381, 1053 
Perlmutter, S., Aldering, G., Goldhaber, G., Knop, R. A., Nugent, 
P., Castro, P. G., Deustua, S., Fabbro, S., Goobar, A., Groom, 
D. E., Hook, I. M., Kim, A. G., Kim, M. Y., Lee, J. C, Nunes, 
N. J., Pain, R., Pennypacker, C. R., Quimby, R., Lidman, 
C, Ellis, R. S., Irwin, M., McMahon, R. G., Ruiz-Lapuente, 
P., Walton, N., Schaefer, B., Boyle, B. J., Filippenko, A. V., 
Matheson, T., Fruchter, A. S., Panagia, N., Newberg, H. J. M., 
Couch, W. J., & The Supernova Cosmology Project. 1999, ApJ, 
517, 565 

Refsdal, S. 1964, MNRAS, 128, 307 

Riess, A. G., Filippenko, A. V., Challis, P., Clocchiatti, A., Diercks, 
A., Garnavich, P. M., Gilliland, R. L., Hogan, C. J., Jha, S., 
Kirshner, R. P., Leibundgut, B., Phillips, M. M., Reiss, D., 
Schmidt, B. P., Schommer, R. A., Smith, R. C., Spyromilio, J., 
Stubbs, C., Suntzeff, N. B., & Tonry, J. 1998, AJ, 116, 1009 

Riess, A. G., Macri, L., Casertano, S., Sosey, M., Lampeitl, H., 
Ferguson, H. C., Filippenko, A. V., Jha, S. W., Li, W., Chornock, 
R., & Sarkar, D. 2009, ApJ, 699, 539 

Saha, P., Coles, J., Macciò, A. V., & Williams, L. L. R. 2006, ApJ, 
650, L17 

Schmidt, B. P., Kirshner, R. P., Eastman, R. G., Phillips, M. M., 
Suntzeff, N. B., Hamuy, M., Maza, J., & Aviles, R. 1994, ApJ, 
432, 42 



Schneider, P., Kochanek, C. S., & Wambsganss, J. 2006, 
Gravitational Lensing: Strong, Weak and Micro (Springer) 

Sivia, D. S. 1996, Data Analysis: A Bayesian Tutorial (Oxford: 
OUP) 

Spergel, D. N., Bean, R., Dore, O., Nolta, M. R., Bennett, C. L., 
Dunkley, J., Hinshaw, G., Jarosik, N., Komatsu, E., Page, L., 
Peiris, H. V., Verde, L., Halpern, M., Hill, R. S., Kogut, A., 
Limon, M., Meyer, S. S., Odegard, N., Tucker, G. S., Weiland, 
J. L., Wollack, E., & Wright, E. L. 2007, ApJS, 170, 377 
Springel, V., White, S. D. M., Jenkins, A., Frenk, C. S., Yoshida, 
N., Gao, L., Navarro, J., Thacker, R., Croton, D., Helly, J., 
Peacock, J. A., Cole, S., Thomas, P., Couchman, H., Evrard, 
A., Colberg, J., & Pearce, F. 2005, Nature, 435, 629 
Stiavelli, M. & Sparke, L. S. 1991, ApJ, 382, 466 
Sunyaev, R. A. & Zel'dovich, I. B. 1980, ARA&A, 18, 537 
Suyu, S. H., Marshall, P. J., Blandford, R. D., Fassnacht, C. D., 
Koopmans, L. V. E., McKean, J. P., & Treu, T. 2009, ApJ, 691, 
277 

Suyu, S. H., Marshall, P. J., Hobson, M. P., & Blandford, R. D. 
2006, MNRAS, 371, 983 

Tammann, G. A. 1979, in NASA Conference Publication, Vol. 2111, 
NASA Conference Publication, 263-293 

Tonry, J. L. & Franx, M. 1999, ApJ, 515, 512 

Treu, T., Gavazzi, R., Gorecki, A., Marshall, P. J., Koopmans, 
L. V. E., Bolton, A. S., Moustakas, L. A., & Burles, S. 2009, 
ApJ, 690, 670 

Treu, T., Koopmans, L. V., Bolton, A. S., Burles, S., & Moustakas, 
L. A. 2006, ApJ, 640, 662 

Treu, T. & Koopmans, L. V. E. 2002, MNRAS, 337, L6 
van der Marel, R. P. 1994, MNRAS, 270, 271 

Vuissoz, C., Courbin, F., Sluse, D., Meylan, G., Chantry, V., 
Eulaers, E., Morgan, C., Eyler, M. E., Kochanek, C. S., Coles, 
J., Saha, P., Magain, P., & Falco, E. E. 2008, A&A, 488, 481 

Vuissoz, C., Courbin, F., Sluse, D., Meylan, G., Ibrahimov, M., 
Asfandiyarov, I., Stoops, E., Eigenbrod, A., Le Guillou, L., van 
Winckel, H., & Magain, P. 2007, A&A, 464, 845 

Wucknitz, O. 2002, MNRAS, 332, 951 

Wucknitz, O., Biggs, A. D., & Browne, I. W. A. 2004, MNRAS, 
349, 14 

York, T., Jackson, N., Browne, I. W. A., Wucknitz, O., & Skelton, 
J. E. 2005, MNRAS, 357, 124 



APPENDIX 

A. PROBABILITY THEORY FOR MEASURING COSMOLOGICAL PARAMETERS 

In this appendix, we describe how we derive the expressions for the likelihoods of the time delay and the ACS 
data sets stated in Section 3.3. We also provide details of the sampling techniques used to calculate the posterior 
probability density of the cosmological parameters. 

A.1. Simplification of the likelihoods 

For B1608+656, we can simplify the marginalization of the likelihoods $P(\Delta t|\xi)$ and $P(d|\xi)$ in Equation (24) based 
on the following facts, which are either from Paper I or shown in Section 4: 

• from Paper I, the top $M_{\rm D}$ models led to equal evidence values $P(d|\gamma', \eta, \delta\psi, M_{\rm D})$ (within the uncertainties). 

• in Section 4, the likelihood function $P(\Delta t|\xi)$ is approximately constant for these top models for a given 
cosmology. 

• in Section 4, for good data models $M_{\rm D}$ of B1608+656, the potential corrections $\delta\psi$ do not significantly change the 
predicted values of the Fermat potential, i.e., the simply-parametrized SPLE initial model provides an unbiased 
estimator for the Fermat potential. 

• the likelihood $P(\sigma|\xi)$ is also constant for the various $M_{\rm D}$ because the dynamics modeling is independent of the 
lensed image processing models $M_{\rm D}$. 

• simulations suggest that the potential corrections are sharply peaked about the most probable values $\delta\psi_{\rm MP}$. 

With the above results, the Fermat potential can be more easily computed from the SPLE model: there is a strong 
correlation between $\Delta\phi$ and $\gamma'$, and we obtain the relation between $\Delta\phi$ and $\gamma'$ by evaluating $\Delta\phi$ at several 
discrete $\gamma'$ values and interpolating between them. For notational simplicity, we consequently drop the 
dependences of $\Delta\phi$ on $\eta$ and $\delta\psi$, since $\Delta\phi$ is nearly independent of both. In computing the predicted $\Delta\phi$, we take 
the source position to be the average of the four image positions mapped (via the lens equation) onto the source plane. 
Denoting the dependence of $\Delta\phi$ on $\gamma'$ for $\kappa_{\rm ext} = 0$ and a given $M_{\rm D}$ as $q(\gamma', M_{\rm D})$, and using the mass-sheet 
degeneracy relation for the dependence of $\Delta\phi$ on $\kappa_{\rm ext}$, we obtain Equations (25) through (27) for the likelihood 
of the time delay data.* 
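
The interpolation step can be illustrated with a minimal sketch. It assumes the standard mass-sheet scaling, in which the Fermat potential difference at external convergence $\kappa_{\rm ext}$ is $(1 - \kappa_{\rm ext})$ times its value at $\kappa_{\rm ext} = 0$. The function name, the grid of $\gamma'$ values, and the tabulated $q$ values below are hypothetical placeholders, not numbers from the B1608+656 analysis, and linear interpolation is one possible choice among several.

```python
import numpy as np

def fermat_potential_difference(gamma_prime, kappa_ext, gamma_grid, q_grid):
    """Return (1 - kappa_ext) * q(gamma'), with q linearly interpolated.

    gamma_grid, q_grid: tabulated values of q at discrete slopes
    (hypothetical inputs; gamma_grid must be increasing).
    """
    q = np.interp(gamma_prime, gamma_grid, q_grid)
    # Mass-sheet degeneracy: the potential difference scales as (1 - kappa_ext).
    return (1.0 - kappa_ext) * q

# Placeholder table in arbitrary units (NOT values from the paper).
gamma_grid = np.array([1.9, 2.0, 2.1, 2.2])
q_grid = np.array([0.9, 1.0, 1.1, 1.2])

dphi = fermat_potential_difference(2.05, 0.10, gamma_grid, q_grid)
print(dphi)
```

Tabulating $q$ once per data model keeps the expensive lens modeling out of the sampling loop: each sample of $(\gamma', \kappa_{\rm ext})$ then costs only an interpolation and a multiplication.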

For the likelihood of the ACS data, we work with the SPLE potential model on the basis that the Fermat potential 
is insensitive to the potential corrections and that the potential corrections only slightly alter the ranking of the $M_{\rm D}$ models. 
We can choose $M_{\rm D}$ to be one of the top models, say $M_5$ (since the normalization in $P(\pi|\Delta t, d, \sigma)$ is irrelevant), 
and drop the dependence on $\delta\psi$ in the likelihood of the ACS data to simplify part of the integrand in Equation (24): 



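$$
\begin{aligned}
&\int d\delta\psi \, ds \, dM_{\rm D} \; P(d|\gamma', \eta, \delta\psi, s, M_{\rm D}) \, P(s|\lambda, g) \, P(M_{\rm D}) \, P(\delta\psi) \\
&\qquad \sim \int ds \; P(d|\gamma', \eta, s, M_{\rm D} = M_5) \, P(s|\lambda, g) \\
&\qquad = P(d|\gamma', \eta, M_{\rm D} = M_5),
\end{aligned}
\tag{A1}
$$
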
which is also Equation (29). In deriving the above equation, we assume that the representative set of models $M_{\rm D}$ 
obtained in Suyu et al. (2009) are equally probable a priori (i.e., the prior $P(M_{\rm D})$ is constant). For the priors 
$P(s|\lambda, g)$ and $P(\delta\psi)$, we use quadratic forms of the regularizing function. Specifically, we try zeroth-order, gradient, 
and curvature forms for $P(s|\lambda, g)$, and the curvature form for $P(\delta\psi)$, as described in Suyu et al. (2006) and Suyu et al. 
(2009). As a reminder, the quantity $P(d|\gamma', \eta, M_{\rm D} = M_5)$ is the Bayesian evidence from source reconstruction given 
the lens model parameters $\{\gamma', \eta\}$ and the data model $M_5$; this evidence value is calculable based on Suyu et al. 
(2009). 

A.2. Importance sampling 

In practice, we can incorporate the various likelihoods in Equation (33) by importance sampling the prior distribution 
(see e.g. Lewis & Bridle 2002, for an introduction). This is a method for calculating integrals over a PDF $P_2$ when 
all we have is samples drawn from some other PDF $P_1$. Consider the expectation value of a parameter $x$: 



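$$
\langle x \rangle_2 = \int x \cdot P_2(x) \, dx, \tag{A2}
$$

$$
\phantom{\langle x \rangle_2} = \int x \cdot \frac{P_2(x)}{P_1(x)} \cdot P_1(x) \, dx. \tag{A3}
$$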



The process of weighting the samples from $P_1$ by the ratio $P_2(x)/P_1(x)$ is called importance sampling. It works most 
efficiently when $P_1$ and $P_2$ are quite similar, and fails if $P_1$ is zero-valued over some of the range of $P_2$, or if the 
sampling of $P_1$ is too sparse. 
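
As an illustration of Equations (A2) and (A3), the following minimal sketch estimates the mean of $x$ under a target PDF $P_2$ using only samples drawn from $P_1$. The two Gaussian PDFs, the sample size, and the helper names are assumptions chosen for the example; they are unrelated to the distributions used in the B1608+656 analysis. The estimate divides by the sum of the weights, so neither PDF needs to be correctly normalized.

```python
import numpy as np

def gaussian_pdf(x, mean, sigma):
    """Normalized one-dimensional Gaussian density."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def importance_mean(samples, p1, p2):
    """Estimate <x> under p2 from samples drawn from p1 (self-normalized)."""
    weights = p2(samples) / p1(samples)   # ratio P2(x) / P1(x) per sample
    return np.sum(weights * samples) / np.sum(weights)

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, size=200_000)        # draws from P1 = N(0, 1)

estimate = importance_mean(
    samples,
    p1=lambda x: gaussian_pdf(x, 0.0, 1.0),
    p2=lambda x: gaussian_pdf(x, 0.5, 0.8),          # target P2 = N(0.5, 0.8)
)
print(estimate)  # should lie close to the target mean of 0.5
```

Here $P_1$ is broader than $P_2$ and covers its full range, which is the favorable case described above; reversing the roles of the two PDFs would leave the tails of the target poorly sampled.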

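In our case, we would like to calculate integrals over, for example, $P_2 = P(\pi, \gamma', \kappa_{\rm ext}, r_{\rm ani}|\Delta t, d, \sigma)$, while the prior 
is written simply $P_1 = P(\pi, \gamma', \kappa_{\rm ext}, r_{\rm ani}|d)$ (recall that $d$ is used to provide a prior on $\gamma'$). Using Bayes' theorem, we 
can write 

$$
P(\pi, \gamma', \kappa_{\rm ext}, r_{\rm ani}|\Delta t, d, \sigma) \propto P(\Delta t, \sigma|\pi, \gamma', \kappa_{\rm ext}, r_{\rm ani}) \, P(\pi, \gamma', \kappa_{\rm ext}, r_{\rm ani}|d),
$$

$$
\text{i.e.} \quad P_2 \propto P(\Delta t, \sigma|\pi, \gamma', \kappa_{\rm ext}, r_{\rm ani}) \cdot P_1. \tag{A4}
$$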

From this we can see that the weight we must attach to each sample from the prior is just the value of the likelihood 
$P(\Delta t, \sigma|\pi, \gamma', \kappa_{\rm ext}, r_{\rm ani})$. Note that these weights can be rescaled by an arbitrary factor, which can be important in 
retaining numerical stability. 
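
One common way to exploit this freedom to rescale, shown here as a sketch rather than as the procedure used in this work, is to carry log-likelihoods and subtract the largest value before exponentiating. The function name and the example log-likelihood values are hypothetical.

```python
import numpy as np

def normalized_weights(log_likelihood):
    """Convert per-sample log-likelihoods into weights that sum to one.

    Subtracting the maximum rescales every weight by the same factor,
    so the relative weights are unchanged while underflow is avoided.
    """
    log_l = np.asarray(log_likelihood, dtype=float)
    shifted = log_l - np.max(log_l)      # largest entry becomes exactly 0
    weights = np.exp(shifted)            # largest weight becomes exactly 1
    return weights / np.sum(weights)

# Hypothetical log-likelihoods; exp(-1000) alone would underflow to zero.
log_l = np.array([-1000.0, -1001.0, -1005.0])
weights = normalized_weights(log_l)
print(weights)
```

Exponentiating the raw values directly would give zero for every sample and an undefined normalization, whereas the shifted version preserves the ratios between samples.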

We apply this technique to perform the marginalization in Equation (33). Specifically, we have samples of $P(\pi)$, 
$P(\gamma'|d, M_{\rm D} = M_5)$, $P(\kappa_{\rm ext})$ and $P(r_{\rm ani})$, and employ importance sampling to obtain $P(\pi|\Delta t, d, \sigma)$. 



* The Fermat potential is independent of cosmological parameters because we work in terms of the scaled lens potential in angular 
(arcsecond) units. Cosmological parameters and redshifts are only needed for deriving physical quantities of the lens system, such as 
the one-dimensional velocity dispersion of the lens (calculable from the Einstein radius), the mass of the lens, and the physical extent 
of the lens/source. 



